Our dashboard was getting slower. Some pages took 5+ seconds to load. Users were complaining.

I ran EXPLAIN ANALYZE and found the culprit: slow database queries. After a week of optimization, I got most queries under 50ms. Here’s how.

The Problem: Slow Queries

Our main dashboard query was taking 5.2 seconds:

SELECT 
    u.id,
    u.name,
    u.email,
    COUNT(o.id) as order_count,
    SUM(o.total) as total_spent,
    MAX(o.created_at) as last_order
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2016-01-01'
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 100;

This query scans 500k users and 2M orders. Way too slow.

Step 1: Analyze the Query

First, I used EXPLAIN ANALYZE to see what’s happening:

EXPLAIN ANALYZE
SELECT ...

Output:

Sort  (cost=125432.23..125432.48 rows=100 width=64) (actual time=5234.123..5234.145 rows=100 loops=1)
  Sort Key: (sum(o.total)) DESC
  ->  HashAggregate  (cost=124234.12..124534.12 rows=30000 width=64) (actual time=5123.234..5189.456 rows=450000 loops=1)
        ->  Hash Left Join  (cost=12345.00..98765.43 rows=2000000 width=32) (actual time=234.123..4567.890 rows=2000000 loops=1)
              Hash Cond: (o.user_id = u.id)
              ->  Seq Scan on orders o  (cost=0.00..45678.00 rows=2000000 width=24) (actual time=0.012..1234.567 rows=2000000 loops=1)
              ->  Hash  (cost=11234.00..11234.00 rows=88880 width=16) (actual time=234.111..234.111 rows=450000 loops=1)
                    ->  Seq Scan on users u  (cost=0.00..11234.00 rows=88880 width=16) (actual time=0.008..123.456 rows=450000 loops=1)
                          Filter: (created_at > '2016-01-01'::date)
                          Rows Removed by Filter: 50000

Key problems:

  1. Seq Scan on orders - Scanning all 2M rows
  2. Seq Scan on users - Scanning all 500k rows
  3. HashAggregate - Aggregating 2M rows

Step 2: Add Indexes

The query filters on users.created_at and joins on orders.user_id. Let’s add indexes:

-- Index on users.created_at
CREATE INDEX idx_users_created_at ON users(created_at);

-- Index on orders.user_id (for join)
CREATE INDEX idx_orders_user_id ON orders(user_id);
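
One note: a plain CREATE INDEX holds a lock that blocks writes to the table while the index builds. On a busy production table, PostgreSQL can build the same index without blocking writes using CONCURRENTLY (it's slower and can't run inside a transaction block); a minimal sketch:

-- Same index, built without blocking writes on orders
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);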

After adding indexes:

EXPLAIN ANALYZE
SELECT ...

New output:

Sort  (cost=45678.23..45678.48 rows=100 width=64) (actual time=1234.123..1234.145 rows=100 loops=1)
  ->  HashAggregate  (cost=44234.12..44534.12 rows=30000 width=64) (actual time=1123.234..1189.456 rows=450000 loops=1)
        ->  Hash Left Join  (cost=5678.00..38765.43 rows=2000000 width=32) (actual time=134.123..987.890 rows=2000000 loops=1)
              ->  Index Scan using idx_users_created_at on users u  (cost=0.42..1234.00 rows=88880 width=16) (actual time=0.008..23.456 rows=450000 loops=1)
                    Index Cond: (created_at > '2016-01-01'::date)
              ->  Hash  (cost=4567.00..4567.00 rows=2000000 width=24) (actual time=134.111..134.111 rows=2000000 loops=1)
                    ->  Seq Scan on orders o  (cost=0.00..4567.00 rows=2000000 width=24) (actual time=0.012..67.890 rows=2000000 loops=1)

Better! Down from 5.2s to 1.2s. But still too slow.

Step 3: Optimize the Join

The problem: we’re still scanning all orders. We only need orders for users created after 2016-01-01.

Rewrite the query:

SELECT 
    u.id,
    u.name,
    u.email,
    COUNT(o.id) as order_count,
    SUM(o.total) as total_spent,
    MAX(o.created_at) as last_order
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2016-01-01'
  AND (o.user_id IS NULL OR o.user_id IN (
    SELECT id FROM users WHERE created_at > '2016-01-01'
  ))
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 100;

Actually, that’s worse: the extra predicate is redundant, because the left join already only matches orders belonging to the filtered users. Let me try a different approach and move the filter into a CTE:

WITH recent_users AS (
    SELECT id, name, email
    FROM users
    WHERE created_at > '2016-01-01'
)
SELECT 
    u.id,
    u.name,
    u.email,
    COUNT(o.id) as order_count,
    SUM(o.total) as total_spent,
    MAX(o.created_at) as last_order
FROM recent_users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name, u.email
ORDER BY total_spent DESC
LIMIT 100;

Result: 850ms. Better, but not great.

Step 4: Add a Composite Index

The join uses orders.user_id, and we’re aggregating total and created_at. A composite index helps:

CREATE INDEX idx_orders_user_id_total_created 
ON orders(user_id, total, created_at);

Now PostgreSQL can satisfy most of the join and aggregation directly from the index:

EXPLAIN ANALYZE
SELECT ...

Result: 320ms. Much better!
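
One caveat: the query also references o.id inside COUNT(o.id), and id isn't part of the index above, so the planner may still need to visit the table for that column. On PostgreSQL 11+ a covering index with an INCLUDE clause can carry it as a non-key column; a sketch (the index name is illustrative):

-- Hypothetical covering variant: user_id is the key,
-- the aggregated columns ride along as non-key columns
CREATE INDEX idx_orders_user_id_covering
ON orders(user_id) INCLUDE (total, created_at, id);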

Step 5: Denormalize for Read Performance

For frequently accessed aggregates, denormalization helps. Add a summary table:

CREATE TABLE user_order_summary (
    user_id INTEGER PRIMARY KEY,
    order_count INTEGER DEFAULT 0,
    total_spent DECIMAL(10,2) DEFAULT 0,
    last_order_at TIMESTAMP,
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_user_order_summary_total_spent 
ON user_order_summary(total_spent DESC);

Keep the table in sync with a trigger that fires on new orders (updates and deletes would need similar handling):

-- Trigger function
CREATE OR REPLACE FUNCTION update_user_order_summary()
RETURNS TRIGGER AS $$
BEGIN
    INSERT INTO user_order_summary (user_id, order_count, total_spent, last_order_at)
    VALUES (
        NEW.user_id,
        1,
        NEW.total,
        NEW.created_at
    )
    ON CONFLICT (user_id) DO UPDATE SET
        order_count = user_order_summary.order_count + 1,
        total_spent = user_order_summary.total_spent + NEW.total,
        last_order_at = GREATEST(user_order_summary.last_order_at, NEW.created_at),
        updated_at = NOW();
    
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Trigger
CREATE TRIGGER trg_update_user_order_summary
AFTER INSERT ON orders
FOR EACH ROW
EXECUTE FUNCTION update_user_order_summary();
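
The trigger only fires for new orders, so the summary table also needs a one-time backfill from the orders that already exist; a minimal sketch:

-- Populate the summary table from historical orders
INSERT INTO user_order_summary (user_id, order_count, total_spent, last_order_at)
SELECT user_id, COUNT(id), SUM(total), MAX(created_at)
FROM orders
GROUP BY user_id;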

Now the query is simple:

SELECT 
    u.id,
    u.name,
    u.email,
    COALESCE(s.order_count, 0) as order_count,
    COALESCE(s.total_spent, 0) as total_spent,
    s.last_order_at as last_order
FROM users u
LEFT JOIN user_order_summary s ON s.user_id = u.id
WHERE u.created_at > '2016-01-01'
ORDER BY s.total_spent DESC NULLS LAST
LIMIT 100;

Result: 45ms! 100x faster than the original.

Step 6: Add Application-Level Caching

For data that doesn’t change often, cache in Redis:

import redis
import json

redis_client = redis.Redis(host='localhost', port=6379, db=0)

def get_top_users(cache_ttl=300):
    cache_key = 'dashboard:top_users'
    
    # Try cache first
    cached = redis_client.get(cache_key)
    if cached:
        return json.loads(cached)
    
    # Query database
    query = """
        SELECT 
            u.id,
            u.name,
            u.email,
            COALESCE(s.order_count, 0) as order_count,
            COALESCE(s.total_spent, 0) as total_spent,
            s.last_order_at as last_order
        FROM users u
        LEFT JOIN user_order_summary s ON s.user_id = u.id
        WHERE u.created_at > '2016-01-01'
        ORDER BY s.total_spent DESC NULLS LAST
        LIMIT 100
    """
    
    # `db` is the application's existing database handle (e.g. a SQLAlchemy connection)
    result = db.execute(query).fetchall()
    data = [dict(row) for row in result]

    # Cache for 5 minutes; default=str serializes Decimal and datetime values
    redis_client.setex(cache_key, cache_ttl, json.dumps(data, default=str))
    
    return data

First request: 45ms (database). Subsequent requests: <1ms (cache).

Step 7: Optimize Other Slow Queries

I found more slow queries using pg_stat_statements:

-- Enable pg_stat_statements (the module must also be listed in
-- shared_preload_libraries, which requires a server restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slowest queries by mean execution time
-- (on PostgreSQL 13+ the timing columns are total_exec_time, mean_exec_time, max_exec_time)
SELECT 
    query,
    calls,
    total_time,
    mean_time,
    max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 20;
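
After each round of optimization, resetting the collected statistics makes before/after comparisons easier; the extension ships a reset function:

-- Clear collected statistics and start measuring fresh
SELECT pg_stat_statements_reset();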

This showed several N+1 query problems in our code.

N+1 Query Problem

Bad code:

# Get all users
users = User.query.all()

# For each user, get their orders (N+1 queries!)
for user in users:
    orders = Order.query.filter_by(user_id=user.id).all()
    print(f"{user.name}: {len(orders)} orders")

This executes 1 query for users + N queries for orders = N+1 queries.

Fixed with eager loading:

from sqlalchemy.orm import joinedload

# Single query with JOIN
users = User.query.options(joinedload(User.orders)).all()

for user in users:
    print(f"{user.name}: {len(user.orders)} orders")

Now it’s just 1 query.

Partial Indexes

For queries that filter on specific values, partial indexes are smaller and faster:

-- Instead of indexing all orders
CREATE INDEX idx_orders_status ON orders(status);

-- Index only pending orders (most queries filter for this)
CREATE INDEX idx_orders_pending ON orders(status) 
WHERE status = 'pending';

Partial indexes are smaller, so they fit in memory better.
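
The predicate can also be combined with a different key column, so a common access pattern like "oldest pending orders" is served entirely by a small index. A sketch (the index name and query are illustrative):

-- Index only pending orders, keyed by age
CREATE INDEX idx_orders_pending_created_at
ON orders(created_at)
WHERE status = 'pending';

-- A query this index can serve
SELECT id, user_id, created_at
FROM orders
WHERE status = 'pending'
ORDER BY created_at
LIMIT 50;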

VACUUM and ANALYZE

PostgreSQL needs regular maintenance:

-- Update statistics for query planner
ANALYZE;

-- Reclaim space from deleted rows
VACUUM;

-- Both at once
VACUUM ANALYZE;

I set up a cron job to run this nightly:

# /etc/cron.d/postgres-maintenance
0 2 * * * postgres psql -d myapp -c "VACUUM ANALYZE;"
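
Autovacuum handles most of this automatically; to check whether it's keeping up, dead-tuple counts per table are worth a look. A small sketch using pg_stat_user_tables:

-- Tables with the most dead rows, and when they were last vacuumed
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;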

Connection Pooling

Our app was creating a new database connection for every request. This is slow.

I added PgBouncer for connection pooling:

# /etc/pgbouncer/pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 100
default_pool_size = 20

Application now connects to PgBouncer (port 6432) instead of PostgreSQL directly (port 5432).

Result: Connection overhead reduced from ~10ms to <1ms.

Monitoring

I set up monitoring with pg_stat_statements and Grafana:

-- Total statement calls (pg_stat_statements is cumulative: sample this
-- periodically and divide the delta by the interval to get QPS)
SELECT sum(calls) AS total_calls
FROM pg_stat_statements;

-- Cache hit ratio (should be >99%)
SELECT 
    sum(blks_hit) / (sum(blks_hit) + sum(blks_read)) as cache_hit_ratio
FROM pg_stat_database;

-- Active connections
SELECT COUNT(*) FROM pg_stat_activity WHERE state = 'active';
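
For the slow-query alert below, pg_stat_activity can also be polled for statements that are running right now and have been active too long; a sketch:

-- Queries currently running for more than 1 second
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 second'
ORDER BY duration DESC;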

I alert when:

  • Cache hit ratio < 95%
  • Active connections > 80
  • Slow queries > 1s

Results Summary

Optimization        Query Time    Improvement
Original            5200ms        -
Added indexes       1200ms        77%
Optimized join      850ms         84%
Composite index     320ms         94%
Denormalization     45ms          99.1%
Redis cache         <1ms          99.98%

Lessons Learned

  1. Measure first - Use EXPLAIN ANALYZE before optimizing
  2. Indexes are crucial - But don’t over-index
  3. Denormalize for reads - Write complexity for read speed
  4. Cache aggressively - Redis is fast
  5. Monitor continuously - Performance degrades over time

Common Mistakes

1. Too many indexes

Every index slows down writes. I had 15 indexes on the orders table. Removed 8 unused ones.
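
To find removal candidates, pg_stat_user_indexes shows how often each index has actually been scanned; a sketch (candidates should still be sanity-checked against replicas and rarely-run reports):

-- Indexes that have never been scanned since statistics were last reset
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;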

2. Not using LIMIT

Always limit results:

-- Bad: returns all rows
SELECT * FROM users;

-- Good: returns only what you need
SELECT * FROM users LIMIT 100;

3. SELECT *

Only select columns you need:

-- Bad: transfers unnecessary data
SELECT * FROM users;

-- Good: only what's needed
SELECT id, name, email FROM users;

4. Type conversions on indexed columns

When the indexed column has to be cast to match the comparison value (explicitly, or implicitly because a driver binds a parameter with the wrong type), the index can't be used:

-- Bad: casting the column defeats the index
WHERE user_id::text = '123'

-- Good: compare using the column's native type
WHERE user_id = 123

Tools I Use

  • EXPLAIN ANALYZE - Query execution plans
  • pg_stat_statements - Query statistics
  • pgBadger - Log analyzer
  • PgBouncer - Connection pooling
  • Redis - Caching layer

Conclusion

Database optimization is an ongoing process. These changes took a week but improved user experience dramatically.

Key takeaways:

  • Use EXPLAIN ANALYZE to find bottlenecks
  • Add indexes strategically
  • Denormalize for read-heavy workloads
  • Cache frequently accessed data
  • Monitor query performance

Our dashboard now loads in under 100ms. Users are happy, and so am I.

Performance matters. Invest time in optimization.