** PostgreSQL Indexing: The Hidden Performance Bottleneck
** Discover why your PostgreSQL queries are slow despite having indexes. Learn advanced indexing strategies, common pitfalls, and optimization techniques.
Welcome to TopperBlog! 👋
I'm a tech content creator passionate about helping developers level up their careers and master cutting-edge technologies.
🎯 What I Write About:
• AI/ML Engineering & LLMs
• Web3 & Blockchain Development
• System Design & Architecture
• Interview Preparation (FAANG)
• Freelancing & Remote Work
• Modern Tech Stacks (Next.js, React, Rust, TypeScript)
• Performance Optimization & Best Practices
💼 Mission: Sharing practical, actionable insights that accelerate your tech career and maximize your earning potential.
📚 15+ In-Depth Guides covering everything from earning $10k/month as a freelancer to cracking FAANG interviews.
🌐 Let's connect and grow together in this amazing tech journey!
#TechBlogger #SoftwareEngineering #CareerGrowth #WebDevelopment #AIEngineering
PostgreSQL Indexing: The Hidden Performance Bottleneck You're Probably Missing
You've added indexes to your PostgreSQL database. Your queries should be lightning-fast now, right? Yet somehow, your application still crawls during peak hours, and users are complaining about timeouts. The problem isn't that you lack indexes—it's that you're using them wrong.
Most developers treat indexing as a checkbox exercise: identify slow queries, add an index, move on. But PostgreSQL indexing is far more nuanced than that. The wrong index can be worse than no index at all, consuming disk space and slowing down writes without improving read performance. Understanding how PostgreSQL actually uses indexes is the difference between a database that scales gracefully and one that collapses under load.
The Real Problem: Index Blindness
Here's what typically happens. A developer notices a slow query, runs EXPLAIN ANALYZE, sees a sequential scan, and immediately creates an index on the filtered column. The query speeds up initially, but over time, performance degrades again. Why? Because PostgreSQL's query planner is sophisticated, and it doesn't always use your carefully crafted indexes the way you expect.
The issue runs deeper than just creating indexes. You need to understand index selectivity, statistics, bloat, and how different index types serve different purposes. A B-tree index that works perfectly for equality checks might be useless for pattern matching. A partial index could reduce your index size by 90% while maintaining the same performance benefits.
Understanding PostgreSQL's Index Types
PostgreSQL offers several index types, each optimized for specific use cases. Let's explore them with practical TypeScript examples using the popular pg library.
B-tree Indexes: The Default Workhorse
B-tree indexes handle most scenarios, including equality and range queries. Here's how to create and leverage them effectively:
import { Pool } from 'pg';
const pool = new Pool({
host: 'localhost',
database: 'myapp',
user: 'postgres',
password: 'password',
});
// Create a B-tree index (default type)
async function createUserEmailIndex() {
await pool.query(`
CREATE INDEX CONCURRENTLY idx_users_email
ON users(email);
`);
}
// Query that benefits from the index
async function findUserByEmail(email: string) {
const result = await pool.query(
'SELECT * FROM users WHERE email = $1',
[email]
);
return result.rows[0];
}
The CONCURRENTLY keyword is crucial here. It allows index creation without locking the table, though it takes longer to complete.
Partial Indexes: The Space Saver
Partial indexes only index rows that meet specific conditions. They're smaller, faster to maintain, and often more selective:
async function createActiveUserIndex() {
await pool.query(`
CREATE INDEX CONCURRENTLY idx_users_active_email
ON users(email)
WHERE status = 'active' AND deleted_at IS NULL;
`);
}
// This query will use the partial index
async function findActiveUser(email: string) {
const result = await pool.query(`
SELECT * FROM users
WHERE email = $1
AND status = 'active'
AND deleted_at IS NULL
`, [email]);
return result.rows[0];
}
This partial index might be 70% smaller than a full index if most users are active, dramatically reducing maintenance overhead.
Composite Indexes: Order Matters
When creating multi-column indexes, column order is critical. PostgreSQL can use a composite index for queries that filter on the leftmost columns:
async function createCompositeIndex() {
// Good: Can be used for (tenant_id) or (tenant_id, created_at)
await pool.query(`
CREATE INDEX CONCURRENTLY idx_orders_tenant_created
ON orders(tenant_id, created_at DESC);
`);
}
async function getRecentOrders(tenantId: string, limit: number = 10) {
const result = await pool.query(`
SELECT * FROM orders
WHERE tenant_id = $1
ORDER BY created_at DESC
LIMIT $2
`, [tenantId, limit]);
return result.rows;
}
Expression Indexes: For Computed Values
Sometimes you need to index the result of a function or expression:
async function createLowerEmailIndex() {
await pool.query(`
CREATE INDEX CONCURRENTLY idx_users_email_lower
ON users(LOWER(email));
`);
}
// Case-insensitive email lookup
async function findUserCaseInsensitive(email: string) {
const result = await pool.query(`
SELECT * FROM users
WHERE LOWER(email) = LOWER($1)
`, [email]);
return result.rows[0];
}
Common Pitfalls That Kill Performance
Pitfall 1: Over-Indexing
Every index slows down INSERT, UPDATE, and DELETE operations. I've seen databases with 15+ indexes on a single table, where writes became the bottleneck. Monitor your write performance and remove unused indexes:
async function findUnusedIndexes() {
const result = await pool.query(`
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
`);
return result.rows;
}
Pitfall 2: Ignoring Index Bloat
Indexes accumulate dead tuples over time, especially in high-update tables. Bloated indexes waste space and slow down queries:
async function checkIndexBloat() {
const result = await pool.query(`
SELECT
schemaname,
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;
`);
return result.rows;
}
async function reindexTable(tableName: string) {
await pool.query(`REINDEX TABLE CONCURRENTLY ${tableName};`);
}
Pitfall 3: Wrong Index Type for the Query
Using a B-tree index for full-text search or a GIN index for simple equality checks wastes resources. Match the index type to your query patterns.
Pitfall 4: Not Updating Statistics
PostgreSQL's query planner relies on statistics to choose indexes. Outdated statistics lead to poor execution plans:
async function updateTableStatistics(tableName: string) {
await pool.query(`ANALYZE ${tableName};`);
}
Best Practices Checklist
Follow these guidelines to maintain optimal index performance:
- Always use CONCURRENTLY when creating indexes on production tables to avoid locks
- Monitor index usage regularly and drop unused indexes
- Create partial indexes when filtering on common predicates
- Order composite index columns from most to least selective
- Run ANALYZE after bulk data changes to update statistics
- Set appropriate fillfactor for high-update tables (e.g., 70 instead of 100)
- Use covering indexes to avoid table lookups when possible
- Schedule regular REINDEX operations during maintenance windows
- Test index changes in staging with production-like data volumes
- Document index purposes so future developers understand the rationale
Frequently Asked Questions
Q: How many indexes is too many on a single table?
There's no magic number, but if you have more than 5-7 indexes on a table, scrutinize each one. The real metric is write performance—if INSERTs and UPDATEs are slow, you probably have too many indexes.
Q: Should I index foreign key columns?
Usually, yes. Foreign keys are frequently used in JOIN operations and WHERE clauses. However, if the foreign key has very low cardinality (few distinct values), the index might not be selective enough to help.
Q: When should I use REINDEX versus VACUUM?
VACUUM reclaims space from dead tuples but doesn't rebuild indexes. REINDEX completely rebuilds the index structure, eliminating bloat. Use REINDEX when an index has grown significantly larger than expected or when query performance degrades despite VACUUM.
Q: Can I have too much index selectivity?
Paradoxically, yes. If an index is so selective that it returns most rows in a table, PostgreSQL might choose a sequential scan instead because it's faster than random index lookups. This typically happens when filtering returns more than 5-10% of rows.
Q: How do I know if PostgreSQL is actually using my index?
Use EXPLAIN ANALYZE before your query. Look for "Index Scan" or "Index Only Scan" in the output. If you see "Seq Scan," the index isn't being used. Check that your WHERE clause matches the index definition exactly.
Q: What's the difference between Index Scan and Index Only Scan?
An Index Only Scan retrieves all needed data from the index itself without accessing the table. This is much faster but requires that all selected columns are in the index and that the visibility map is up to date.
Q: Should I index columns used in ORDER BY clauses?
Often, yes. An appropriate index can eliminate the need for PostgreSQL to sort results, which is expensive for large result sets. For descending sorts, create the index with DESC: CREATE INDEX idx_name ON table(column DESC).
Conclusion
PostgreSQL indexing isn't about adding more indexes—it's about adding the right indexes and maintaining them properly. The difference between a well-indexed database and a poorly-indexed one isn't just performance; it's the difference between a system that scales and one that doesn't. Start by auditing your existing indexes, remove the ones that aren't pulling their weight, and be strategic about new ones. Your database—and your users—will thank you.