Database Schema: E-Commerce Design
Welcome to TopperBlog! 👋
I'm a tech content creator passionate about helping developers level up their careers and master cutting-edge technologies.
🎯 What I Write About:
• AI/ML Engineering & LLMs
• Web3 & Blockchain Development
• System Design & Architecture
• Interview Preparation (FAANG)
• Freelancing & Remote Work
• Modern Tech Stacks (Next.js, React, Rust, TypeScript)
• Performance Optimization & Best Practices
💼 Mission: Sharing practical, actionable insights that accelerate your tech career and maximize your earning potential.
📚 15+ In-Depth Guides covering everything from earning $10k/month as a freelancer to cracking FAANG interviews.
🌐 Let's connect and grow together in this amazing tech journey!
#TechBlogger #SoftwareEngineering #CareerGrowth #WebDevelopment #AIEngineering
Why Traditional E-Commerce Schemas Fail at Scale
Legacy e-commerce schemas typically follow a straightforward normalized approach: products table, orders table, order_items junction table, customers table. This works perfectly for 10,000 products and 1,000 daily orders. It collapses under modern requirements.
The first failure point is product catalog flexibility. Traditional schemas use fixed columns for product attributes. Adding a new attribute requires schema migrations. Managing products across categories with different attributes (clothing needs sizes and colors, electronics need specifications and warranties) forces either sparse tables with hundreds of nullable columns or complex EAV (Entity-Attribute-Value) patterns that destroy query performance.
The second failure is inventory management across distributed fulfillment. A single inventory count column can't represent stock across 50 warehouses, reserved quantities for pending orders, in-transit stock, and safety stock thresholds. Race conditions during concurrent checkouts cause overselling. Pessimistic locking creates contention that throttles order throughput.
The third failure is pricing complexity. Modern e-commerce requires customer-segment pricing, time-based promotions, bundle discounts, volume pricing, and dynamic pricing from ML models. Storing a single price column and calculating discounts at query time creates performance problems. Pre-calculating every price variant explodes storage and creates consistency nightmares.
The fourth failure is analytical query performance. Business intelligence queries joining orders, products, customers, and inventory across date ranges lock tables and timeout. Separating OLTP and OLAP workloads requires complex CDC pipelines and introduces data freshness delays that break real-time dashboards.
Modern Schema Architecture for E-Commerce Platforms
A production-grade e-commerce schema in 2025 uses a hybrid approach: normalized core transactional tables for consistency, strategic denormalization for read-heavy paths, JSONB for flexible attributes, materialized views for analytics, and partitioning for scale.
Product Catalog Schema with Flexible Attributes
-- Core product table with fixed, queryable attributes
CREATE TABLE products (
product_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
sku VARCHAR(100) UNIQUE NOT NULL,
name VARCHAR(500) NOT NULL,
category_id UUID NOT NULL REFERENCES categories(category_id),
brand_id UUID REFERENCES brands(brand_id),
base_price DECIMAL(10,2) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'active',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
-- Flexible attributes as JSONB for category-specific fields
attributes JSONB NOT NULL DEFAULT '{}',
-- Denormalized search fields
search_vector tsvector GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(attributes->>'description', '')), 'B')
) STORED
);
-- Indexes for common query patterns
CREATE INDEX idx_products_category ON products(category_id) WHERE status = 'active';
CREATE INDEX idx_products_brand ON products(brand_id) WHERE status = 'active';
CREATE INDEX idx_products_search ON products USING GIN(search_vector);
CREATE INDEX idx_products_attributes ON products USING GIN(attributes jsonb_path_ops);
-- Category-specific attribute validation
CREATE TABLE category_attribute_schemas (
category_id UUID PRIMARY KEY REFERENCES categories(category_id),
required_attributes JSONB NOT NULL,
optional_attributes JSONB NOT NULL,
validation_rules JSONB NOT NULL
);
This schema solves the flexibility problem while maintaining query performance. Fixed columns handle filterable attributes (category, brand, price). JSONB stores category-specific attributes without schema migrations. The generated tsvector column enables full-text search without external services. GIN indexes on JSONB support efficient queries like WHERE attributes @> '{"color": "blue"}'.
Distributed Inventory Management
-- Inventory tracking across multiple locations
CREATE TABLE inventory_locations (
location_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
location_code VARCHAR(50) UNIQUE NOT NULL,
location_type VARCHAR(20) NOT NULL, -- warehouse, store, dropship
is_active BOOLEAN NOT NULL DEFAULT true
);
-- Inventory levels with reservation support
CREATE TABLE inventory (
inventory_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
product_id UUID NOT NULL REFERENCES products(product_id),
location_id UUID NOT NULL REFERENCES inventory_locations(location_id),
quantity_on_hand INTEGER NOT NULL DEFAULT 0,
quantity_reserved INTEGER NOT NULL DEFAULT 0,
quantity_available INTEGER GENERATED ALWAYS AS (quantity_on_hand - quantity_reserved) STORED,
reorder_point INTEGER NOT NULL DEFAULT 0,
reorder_quantity INTEGER NOT NULL DEFAULT 0,
last_counted_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT positive_quantities CHECK (quantity_on_hand >= 0 AND quantity_reserved >= 0),
CONSTRAINT valid_reservation CHECK (quantity_reserved <= quantity_on_hand),
UNIQUE(product_id, location_id)
);
CREATE INDEX idx_inventory_product ON inventory(product_id) WHERE quantity_available > 0;
CREATE INDEX idx_inventory_location ON inventory(location_id);
-- Inventory reservations for order processing
CREATE TABLE inventory_reservations (
reservation_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
inventory_id UUID NOT NULL REFERENCES inventory(inventory_id),
order_id UUID NOT NULL REFERENCES orders(order_id),
quantity INTEGER NOT NULL,
reserved_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
expires_at TIMESTAMPTZ NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'active',
CONSTRAINT positive_quantity CHECK (quantity > 0)
);
CREATE INDEX idx_reservations_expiry ON inventory_reservations(expires_at)
WHERE status = 'active';
This inventory schema prevents overselling through explicit reservation tracking. The quantity_available generated column provides instant availability checks. Reservations have expiration timestamps for abandoned carts. A background job releases expired reservations:
// Inventory reservation service with optimistic locking
interface ReservationRequest {
productId: string;
locationId: string;
quantity: number;
orderId: string;
expiresInMinutes: number;
}
async function reserveInventory(
req: ReservationRequest
): Promise<{ success: boolean; reservationId?: string; error?: string }> {
const client = await pool.connect();
try {
await client.query('BEGIN');
// Lock inventory row for update
const inventoryResult = await client.query(
`SELECT inventory_id, quantity_available, updated_at
FROM inventory
WHERE product_id = $1 AND location_id = $2
FOR UPDATE`,
[req.productId, req.locationId]
);
if (inventoryResult.rows.length === 0) {
throw new Error('Inventory location not found');
}
const inventory = inventoryResult.rows[0];
if (inventory.quantity_available < req.quantity) {
await client.query('ROLLBACK');
return {
success: false,
error: `Insufficient inventory: ${inventory.quantity_available} available`
};
}
// Create reservation
const reservationResult = await client.query(
`INSERT INTO inventory_reservations
(inventory_id, order_id, quantity, expires_at)
VALUES ($1, $2, $3, NOW() + INTERVAL '1 minute' * $4)
RETURNING reservation_id`,
[inventory.inventory_id, req.orderId, req.quantity, req.expiresInMinutes]
);
// Update reserved quantity
await client.query(
`UPDATE inventory
SET quantity_reserved = quantity_reserved + $1,
updated_at = NOW()
WHERE inventory_id = $2`,
[req.quantity, inventory.inventory_id]
);
await client.query('COMMIT');
return {
success: true,
reservationId: reservationResult.rows[0].reservation_id
};
} catch (error) {
await client.query('ROLLBACK');
throw error;
} finally {
client.release();
}
}
Order Management with Complex Workflows
-- Orders table with denormalized customer data for performance
CREATE TABLE orders (
order_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
order_number VARCHAR(50) UNIQUE NOT NULL,
customer_id UUID NOT NULL REFERENCES customers(customer_id),
-- Denormalized customer snapshot at order time
customer_email VARCHAR(255) NOT NULL,
customer_name VARCHAR(255) NOT NULL,
-- Order totals
subtotal DECIMAL(10,2) NOT NULL,
tax_amount DECIMAL(10,2) NOT NULL,
shipping_amount DECIMAL(10,2) NOT NULL,
discount_amount DECIMAL(10,2) NOT NULL DEFAULT 0,
total_amount DECIMAL(10,2) NOT NULL,
-- Status tracking
status VARCHAR(50) NOT NULL DEFAULT 'pending',
payment_status VARCHAR(50) NOT NULL DEFAULT 'pending',
fulfillment_status VARCHAR(50) NOT NULL DEFAULT 'unfulfilled',
-- Addresses as JSONB for flexibility
shipping_address JSONB NOT NULL,
billing_address JSONB NOT NULL,
-- Metadata
channel VARCHAR(50) NOT NULL, -- web, mobile, api, pos
currency_code CHAR(3) NOT NULL DEFAULT 'USD',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT positive_amounts CHECK (
subtotal >= 0 AND tax_amount >= 0 AND
shipping_amount >= 0 AND total_amount >= 0
)
);
-- Partition by month for performance
CREATE TABLE orders_2025_01 PARTITION OF orders
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE INDEX idx_orders_customer ON orders(customer_id, created_at DESC);
CREATE INDEX idx_orders_status ON orders(status, created_at DESC);
CREATE INDEX idx_orders_created ON orders(created_at DESC);
-- Order items with denormalized product data
CREATE TABLE order_items (
order_item_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
order_id UUID NOT NULL REFERENCES orders(order_id) ON DELETE CASCADE,
product_id UUID NOT NULL REFERENCES products(product_id),
-- Product snapshot at order time
sku VARCHAR(100) NOT NULL,
product_name VARCHAR(500) NOT NULL,
product_attributes JSONB,
-- Pricing
unit_price DECIMAL(10,2) NOT NULL,
quantity INTEGER NOT NULL,
discount_amount DECIMAL(10,2) NOT NULL DEFAULT 0,
line_total DECIMAL(10,2) NOT NULL,
-- Fulfillment
fulfillment_location_id UUID REFERENCES inventory_locations(location_id),
fulfillment_status VARCHAR(50) NOT NULL DEFAULT 'pending',
CONSTRAINT positive_values CHECK (
unit_price >= 0 AND quantity > 0 AND line_total >= 0
)
);
CREATE INDEX idx_order_items_order ON order_items(order_id);
CREATE INDEX idx_order_items_product ON order_items(product_id);
The order schema denormalizes customer and product data intentionally. When analyzing historical orders, you need the customer email and product name as they existed at order time, not current values. This prevents broken reports when customers change emails or products are renamed.
Partitioning orders by month keeps query performance consistent as the table grows. Queries filtering by date range only scan relevant partitions. Partition maintenance becomes routine:
// Automated partition management
async function createNextMonthPartition(): Promise<void> {
const nextMonth = new Date();
nextMonth.setMonth(nextMonth.getMonth() + 1);
nextMonth.setDate(1);
const partitionName = `orders_${nextMonth.getFullYear()}_${String(nextMonth.getMonth() + 1).padStart(2, '0')}`;
const endDate = new Date(nextMonth);
endDate.setMonth(endDate.getMonth() + 1);
await pool.query(`
CREATE TABLE IF NOT EXISTS ${partitionName} PARTITION OF orders
FOR VALUES FROM ('${nextMonth.toISOString().split('T')[0]}')
TO ('${endDate.toISOString().split('T')[0]}')
`);
}
Pricing and Promotions Schema
-- Customer segment-based pricing
CREATE TABLE price_lists (
price_list_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
customer_segment VARCHAR(100), -- wholesale, retail, vip, etc.
currency_code CHAR(3) NOT NULL DEFAULT 'USD',
is_active BOOLEAN NOT NULL DEFAULT true,
valid_from TIMESTAMPTZ NOT NULL,
valid_until TIMESTAMPTZ,
priority INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE price_list_items (
price_list_item_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
price_list_id UUID NOT NULL REFERENCES price_lists(price_list_id),
product_id UUID NOT NULL REFERENCES products(product_id),
price DECIMAL(10,2) NOT NULL,
min_quantity INTEGER NOT NULL DEFAULT 1,
UNIQUE(price_list_id, product_id, min_quantity)
);
-- Promotions with flexible rules
CREATE TABLE promotions (
promotion_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
code VARCHAR(50) UNIQUE,
name VARCHAR(255) NOT NULL,
promotion_type VARCHAR(50) NOT NULL, -- percentage, fixed, bogo, bundle
discount_value DECIMAL(10,2),
-- Eligibility rules as JSONB for flexibility
rules JSONB NOT NULL,
-- Usage limits
max_uses INTEGER,
max_uses_per_customer INTEGER,
current_uses INTEGER NOT NULL DEFAULT 0,
valid_from TIMESTAMPTZ NOT NULL,
valid_until TIMESTAMPTZ NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT true
);
CREATE INDEX idx_promotions_code ON promotions(code) WHERE is_active = true;
CREATE INDEX idx_promotions_dates ON promotions(valid_from, valid_until) WHERE is_active = true;
The pricing schema separates base prices from customer-segment pricing and promotions. This enables complex pricing logic without query-time calculations:
interface PriceCalculationContext {
productId: string;
customerId: string;
quantity: number;
promotionCode?: string;
}
async function calculatePrice(ctx: PriceCalculationContext): Promise<{
basePrice: number;
finalPrice: number;
discounts: Array<{ type: string; amount: number }>;
}> {
// Get customer segment
const customer = await pool.query(
'SELECT customer_segment FROM customers WHERE customer_id = $1',
[ctx.customerId]
);
// Get applicable price list
const priceListResult = await pool.query(
`SELECT pli.price
FROM price_list_items pli
JOIN price_lists pl ON pli.price_list_id = pl.price_list_id
WHERE pli.product_id = $1
AND pl.customer_segment = $2
AND pl.is_active = true
AND pl.valid_from <= NOW()
AND (pl.valid_until IS NULL OR pl.valid_until >= NOW())
AND pli.min_quantity <= $3
ORDER BY pl.priority DESC, pli.min_quantity DESC
LIMIT 1`,
[ctx.productId, customer.rows[0].customer_segment, ctx.quantity]
);
const basePrice = priceListResult.rows[0]?.price ||
(await pool.query('SELECT base_price FROM products WHERE product_id = $1',
[ctx.productId])).rows[0].base_price;
let finalPrice = basePrice * ctx.quantity;
const discounts = [];
// Apply promotion if provided
if (ctx.promotionCode) {
const promotion = await pool.query(
`SELECT promotion_id, promotion_type, discount_value, rules
FROM promotions
WHERE code = $1
AND is_active = true
AND valid_from <= NOW()
AND valid_until >= NOW()
AND (max_uses IS NULL OR current_uses < max_uses)`,
[ctx.promotionCode]
);
if (promotion.rows.length > 0) {
const promo = promotion.rows[0];
// Check if product is eligible based on rules
const rules = promo.rules;
if (rules.eligible_products &&
!rules.eligible_products.includes(ctx.productId)) {
// Product not eligible
} else {
let discountAmount = 0;
if (promo.promotion_type === 'percentage') {
discountAmount = finalPrice * (promo.discount_value / 100);
} else if (promo.promotion_type === 'fixed') {
discountAmount = promo.discount_value;
}
finalPrice -= discountAmount;
discounts.push({ type: 'promotion', amount: discountAmount });
}
}
}
return { basePrice, finalPrice, discounts };
}
Analytics and Reporting Schema
Mixing transactional and analytical queries on the same tables causes performance problems. Modern e-commerce platforms use materialized views for common analytical queries:
```sql -- Daily sales summary materialized view CREATE MATERIALIZED VIEW daily_sales_summary AS SELECT DATE(o.created_at) as sale_date, p.category_id, c.name as category_name, COUNT(DISTINCT o.order_id) as order_count, SUM(oi.quantity) as units_sold, SUM(oi.linetotal) as gross