PostgreSQL Query Optimization: From 5s to 50ms
Our dashboard was getting slower. Some pages took 5+ seconds to load. Users were complaining.
I ran EXPLAIN ANALYZE and found the culprit: slow database queries. After a week of optimization, I got most queries under 50ms. Here’s how.
The Problem: Slow Queries
Our main dashboard query was taking 5.2 seconds:
SELECT
u.id,
u.name,
u.email,
COUNT(o.id) as order_count,
SUM(o.total) as total_spent,
MAX(o.created_at) as last_order
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2016-01-01'
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 100;
This query scans 500k users and 2M orders. Way too slow.
Step 1: Analyze the Query
First, I used EXPLAIN ANALYZE to see what’s happening:
EXPLAIN ANALYZE
SELECT ...
Output:
Sort (cost=125432.23..125432.48 rows=100 width=64) (actual time=5234.123..5234.145 rows=100 loops=1)
Sort Key: (sum(o.total)) DESC
-> HashAggregate (cost=124234.12..124534.12 rows=30000 width=64) (actual time=5123.234..5189.456 rows=450000 loops=1)
-> Hash Left Join (cost=12345.00..98765.43 rows=2000000 width=32) (actual time=234.123..4567.890 rows=2000000 loops=1)
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o (cost=0.00..45678.00 rows=2000000 width=24) (actual time=0.012..1234.567 rows=2000000 loops=1)
-> Hash (cost=11234.00..11234.00 rows=88880 width=16) (actual time=234.111..234.111 rows=450000 loops=1)
-> Seq Scan on users u (cost=0.00..11234.00 rows=88880 width=16) (actual time=0.008..123.456 rows=450000 loops=1)
Filter: (created_at > '2016-01-01'::date)
Rows Removed by Filter: 50000
Key problems:
- Seq Scan on orders - Scanning all 2M rows
- Seq Scan on users - Scanning all 500k rows
- HashAggregate - Aggregating 2M rows
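Before adding indexes, it can also help to re-run the plan with the BUFFERS option, which adds shared-buffer hit and read counts and shows how much of the time is going to disk I/O. A minimal sketch, reusing the dashboard query from above:
-- Same query, with buffer statistics added to the plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT ...;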
Step 2: Add Indexes
The query filters on users.created_at and joins on orders.user_id. Let’s add indexes:
-- Index on users.created_at
CREATE INDEX idx_users_created_at ON users(created_at);
-- Index on orders.user_id (for join)
CREATE INDEX idx_orders_user_id ON orders(user_id);
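One production note: a plain CREATE INDEX blocks writes to the table while it builds, which matters on a 2M-row orders table. The CONCURRENTLY form builds the same index without blocking writes, at the cost of a slower build (and it can't run inside a transaction block). Sketch for the orders index:
-- Same index as above, but built without blocking concurrent writes
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);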
After adding indexes:
EXPLAIN ANALYZE
SELECT ...
New output:
Sort (cost=45678.23..45678.48 rows=100 width=64) (actual time=1234.123..1234.145 rows=100 loops=1)
-> HashAggregate (cost=44234.12..44534.12 rows=30000 width=64) (actual time=1123.234..1189.456 rows=450000 loops=1)
-> Hash Left Join (cost=5678.00..38765.43 rows=2000000 width=32) (actual time=134.123..987.890 rows=2000000 loops=1)
-> Index Scan using idx_users_created_at on users u (cost=0.42..1234.00 rows=88880 width=16) (actual time=0.008..23.456 rows=450000 loops=1)
Index Cond: (created_at > '2016-01-01'::date)
-> Hash (cost=4567.00..4567.00 rows=2000000 width=24) (actual time=134.111..134.111 rows=2000000 loops=1)
-> Seq Scan on orders o (cost=0.00..4567.00 rows=2000000 width=24) (actual time=0.012..67.890 rows=2000000 loops=1)
Better! Down from 5.2s to 1.2s. But still too slow.
Step 3: Optimize the Join
The problem: we’re still scanning all orders. We only need orders for users created after 2016-01-01.
Rewrite the query:
SELECT
u.id,
u.name,
u.email,
COUNT(o.id) as order_count,
SUM(o.total) as total_spent,
MAX(o.created_at) as last_order
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2016-01-01'
AND (o.user_id IS NULL OR o.user_id IN (
SELECT id FROM users WHERE created_at > '2016-01-01'
))
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 100;
Actually, that made it worse. Let me try a different approach: filter the users first in a CTE:
WITH recent_users AS (
SELECT id, name, email
FROM users
WHERE created_at > '2016-01-01'
)
SELECT
u.id,
u.name,
u.email,
COUNT(o.id) as order_count,
SUM(o.total) as total_spent,
MAX(o.created_at) as last_order
FROM recent_users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 100;
Result: 850ms. Better, but not great.
Step 4: Add a Composite Index
The join uses orders.user_id, and we’re aggregating total and created_at. A composite index helps:
CREATE INDEX idx_orders_user_id_total_created
ON orders(user_id, total, created_at);
Now the join key and the aggregated columns can be read straight from the index:
EXPLAIN ANALYZE
SELECT ...
Result: 320ms. Much better!
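One caveat on going further: for a true index-only scan, every column the query reads has to be in the index (here COUNT(o.id) still needs id), and the table's visibility map has to be reasonably fresh. If the planner keeps choosing a plain index scan, a covering variant plus a VACUUM is worth trying. A sketch, with an illustrative index name:
-- Covering variant: adds id so COUNT(o.id) can also be answered from the index
CREATE INDEX idx_orders_user_id_covering
    ON orders(user_id, total, created_at, id);
-- Index-only scans also rely on the visibility map, which VACUUM maintains
VACUUM ANALYZE orders;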
Step 5: Denormalize for Read Performance
For frequently accessed aggregates, denormalization helps. Add a summary table:
CREATE TABLE user_order_summary (
user_id INTEGER PRIMARY KEY,
order_count INTEGER DEFAULT 0,
total_spent DECIMAL(10,2) DEFAULT 0,
last_order_at TIMESTAMP,
updated_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_user_order_summary_total_spent
ON user_order_summary(total_spent DESC);
Update this table when orders change:
-- Trigger function
CREATE OR REPLACE FUNCTION update_user_order_summary()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO user_order_summary (user_id, order_count, total_spent, last_order_at)
VALUES (
NEW.user_id,
1,
NEW.total,
NEW.created_at
)
ON CONFLICT (user_id) DO UPDATE SET
order_count = user_order_summary.order_count + 1,
total_spent = user_order_summary.total_spent + NEW.total,
last_order_at = GREATEST(user_order_summary.last_order_at, NEW.created_at),
updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Trigger
CREATE TRIGGER trg_update_user_order_summary
AFTER INSERT ON orders
FOR EACH ROW
EXECUTE FUNCTION update_user_order_summary();
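The trigger only fires for new inserts (and doesn't cover UPDATE or DELETE on orders, which would need their own triggers), so the summary table needs a one-time backfill from the existing data. A sketch:
-- One-time backfill from existing orders (run before relying on the summary)
INSERT INTO user_order_summary (user_id, order_count, total_spent, last_order_at)
SELECT user_id, COUNT(*), SUM(total), MAX(created_at)
FROM orders
GROUP BY user_id;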
Now the query is simple:
SELECT
u.id,
u.name,
u.email,
COALESCE(s.order_count, 0) as order_count,
COALESCE(s.total_spent, 0) as total_spent,
s.last_order_at as last_order
FROM users u
LEFT JOIN user_order_summary s ON s.user_id = u.id
WHERE u.created_at > '2016-01-01'
ORDER BY s.total_spent DESC NULLS LAST
LIMIT 100;
Result: 45ms! 100x faster than the original.
Step 6: Add Application-Level Caching
For data that doesn’t change often, cache in Redis:
import redis
import json
redis_client = redis.Redis(host='localhost', port=6379, db=0)
def get_top_users(cache_ttl=300):
    cache_key = 'dashboard:top_users'

    # Try cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)

    # Query database
    query = """
        SELECT
            u.id,
            u.name,
            u.email,
            COALESCE(s.order_count, 0) as order_count,
            COALESCE(s.total_spent, 0) as total_spent,
            s.last_order_at as last_order
        FROM users u
        LEFT JOIN user_order_summary s ON s.user_id = u.id
        WHERE u.created_at > '2016-01-01'
        ORDER BY s.total_spent DESC NULLS LAST
        LIMIT 100
    """
    result = db.execute(query).fetchall()
    data = [dict(row) for row in result]

    # Cache for 5 minutes (default=str serializes timestamps and decimals)
    redis_client.setex(cache_key, cache_ttl, json.dumps(data, default=str))
    return data
First request: 45ms (database). Subsequent requests: <1ms (cache).
Step 7: Optimize Other Slow Queries
I found more slow queries using pg_stat_statements:
-- Enable pg_stat_statements
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find slowest queries
SELECT
query,
calls,
total_time,
mean_time,
max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 20;
This showed several N+1 query problems in our code.
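A related tip: once a batch of fixes ships, the counters can be reset so the next round of measurements starts clean:
-- Clear accumulated statistics so new numbers aren't mixed with old ones
SELECT pg_stat_statements_reset();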
N+1 Query Problem
Bad code:
# Get all users
users = User.query.all()
# For each user, get their orders (N+1 queries!)
for user in users:
    orders = Order.query.filter_by(user_id=user.id).all()
    print(f"{user.name}: {len(orders)} orders")
This executes 1 query for users + N queries for orders = N+1 queries.
Fixed with eager loading:
from sqlalchemy.orm import joinedload
# Single query with JOIN
users = User.query.options(joinedload(User.orders)).all()
for user in users:
    print(f"{user.name}: {len(user.orders)} orders")
Now it’s just 1 query.
Partial Indexes
For queries that filter on specific values, partial indexes are smaller and faster:
-- Instead of indexing all orders
CREATE INDEX idx_orders_status ON orders(status);
-- Index only pending orders (most queries filter for this)
CREATE INDEX idx_orders_pending ON orders(status)
WHERE status = 'pending';
Partial indexes are smaller, so they fit in memory better.
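A quick way to see the benefit is to compare the two index sizes, assuming both indexes exist for the comparison:
-- The partial index should be a fraction of the size of the full one
SELECT pg_size_pretty(pg_relation_size('idx_orders_status'))  AS full_index,
       pg_size_pretty(pg_relation_size('idx_orders_pending')) AS partial_index;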
VACUUM and ANALYZE
PostgreSQL needs regular maintenance:
-- Update statistics for query planner
ANALYZE;
-- Reclaim space from deleted rows
VACUUM;
-- Both at once
VACUUM ANALYZE;
I set up a cron job to run this nightly:
# /etc/cron.d/postgres-maintenance
0 2 * * * postgres psql -d myapp -c "VACUUM ANALYZE;"
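Autovacuum already runs in the background, so the nightly job is extra insurance. For hot tables like orders, it can also be worth tuning autovacuum per table so it triggers sooner than the defaults; the thresholds below are illustrative:
-- Vacuum after ~5% of rows change and re-analyze after ~2%, instead of the defaults
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);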
Connection Pooling
Our app was creating a new database connection for every request. This is slow.
I added PgBouncer for connection pooling:
# /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 100
default_pool_size = 20
Application now connects to PgBouncer (port 6432) instead of PostgreSQL directly (port 5432).
Result: Connection overhead reduced from ~10ms to <1ms.
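PgBouncer also exposes an admin console on the special pgbouncer database, which is handy for confirming that pooling is actually working (the connecting user has to be listed in admin_users or stats_users):
-- Connect with: psql -p 6432 -U <admin_user> pgbouncer
SHOW POOLS;   -- client and server connections per pool
SHOW STATS;   -- request counts and timings per database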
Monitoring
I set up monitoring with pg_stat_statements and Grafana:
-- Total statements executed; sample this over time (e.g. in Grafana) to derive queries per second
SELECT SUM(calls) AS total_calls
FROM pg_stat_statements;
-- Cache hit ratio (should be >99%)
SELECT
sum(blks_hit) / (sum(blks_hit) + sum(blks_read)) as cache_hit_ratio
FROM pg_stat_database;
-- Active connections
SELECT COUNT(*) FROM pg_stat_activity WHERE state = 'active';
I alert when:
- Cache hit ratio < 95%
- Active connections > 80
- Slow queries > 1s
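For the last alert, a quick ad-hoc check against pg_stat_activity shows anything that has been running for more than a second:
-- Statements currently active for more than 1 second
SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 second'
ORDER BY duration DESC;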
Results Summary
| Optimization | Query Time | Improvement |
|---|---|---|
| Original | 5200ms | - |
| Added indexes | 1200ms | 77% |
| Optimized join | 850ms | 84% |
| Composite index | 320ms | 94% |
| Denormalization | 45ms | 99.1% |
| Redis cache | <1ms | 99.98% |
Lessons Learned
- Measure first - Use EXPLAIN ANALYZE before optimizing
- Indexes are crucial - But don’t over-index
- Denormalize for reads - Write complexity for read speed
- Cache aggressively - Redis is fast
- Monitor continuously - Performance degrades over time
Common Mistakes
1. Too many indexes
Every index slows down writes. I had 15 indexes on the orders table. Removed 8 unused ones.
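Unused indexes are easy to spot in pg_stat_user_indexes; idx_scan counts how often the planner has used each index since statistics were last reset:
-- Indexes the planner has never used (candidates for removal)
SELECT relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname, indexrelname;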
2. Not using LIMIT
Always limit results:
-- Bad: returns all rows
SELECT * FROM users;
-- Good: returns only what you need
SELECT * FROM users LIMIT 100;
3. SELECT *
Only select columns you need:
-- Bad: transfers unnecessary data
SELECT * FROM users;
-- Good: only what's needed
SELECT id, name, email FROM users;
4. Implicit type conversions
Make the comparison value match the column's type. A quoted integer literal is usually coerced for you, but when the indexed column itself has to be cast, the index can't be used:
-- Bad: casting the indexed column defeats the index
WHERE user_id::text = '123'
-- Good: compare using the column's native type
WHERE user_id = 123
Tools I Use
- EXPLAIN ANALYZE - Query execution plans
- pg_stat_statements - Query statistics
- pgBadger - Log analyzer
- PgBouncer - Connection pooling
- Redis - Caching layer
Conclusion
Database optimization is an ongoing process. These changes took a week but improved user experience dramatically.
Key takeaways:
- Use EXPLAIN ANALYZE to find bottlenecks
- Add indexes strategically
- Denormalize for read-heavy workloads
- Cache frequently accessed data
- Monitor query performance
Our dashboard now loads in under 100ms. Users are happy, and so am I.
Performance matters. Invest time in optimization.