Our application’s database queries were getting slower as data grew. Here’s how I optimized PostgreSQL to handle 10x more data with 25x faster queries.

The Problem

Symptoms:

  • Slow page loads (2-3 seconds)
  • Database CPU at 80%+
  • Growing query times as data increased
  • User complaints

Database Stats:

  • 50M rows in main table
  • 200 queries per second
  • Average query time: 500ms
  • P95 query time: 2000ms

Step 1: Identify Slow Queries

Enable Query Logging

# postgresql.conf
log_min_duration_statement = 100  # Log queries > 100ms
log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
log_statement = 'none'
log_duration = off

Use pg_stat_statements

-- Enable extension
CREATE EXTENSION pg_stat_statements;

-- Find slowest queries
-- (on PostgreSQL 13+ the columns are named total_exec_time, mean_exec_time, max_exec_time)
SELECT 
    query,
    calls,
    total_time,
    mean_time,
    max_time
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;

Result: Found 5 queries consuming 80% of database time.
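
The 80% figure can be computed directly from pg_stat_statements by comparing each statement's time against the total. A minimal sketch (again, use total_exec_time on PostgreSQL 13+):

-- Percentage of total execution time attributable to each statement
SELECT
    left(query, 60) AS query_start,
    calls,
    round((100 * total_time / sum(total_time) OVER ())::numeric, 1) AS pct_of_total
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;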

Step 2: Analyze Query Plans

Slow Query Example

SELECT u.*, p.*, o.* 
FROM users u
LEFT JOIN profiles p ON u.id = p.user_id
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2022-01-01'
AND u.status = 'active'
ORDER BY u.created_at DESC
LIMIT 20;

Execution time: 850ms

Use EXPLAIN ANALYZE

EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT u.*, p.*, o.* 
FROM users u
LEFT JOIN profiles p ON u.id = p.user_id
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2022-01-01'
AND u.status = 'active'
ORDER BY u.created_at DESC
LIMIT 20;

Findings:

  • Sequential scan on users table (50M rows)
  • No index on created_at
  • No index on status
  • Inefficient join order
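
The missing indexes are easy to confirm from the catalog: pg_indexes lists every index defined on a table, and neither created_at nor status showed up for users.

-- List existing indexes on the users table
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users';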

Step 3: Add Strategic Indexes

Composite Index

-- Create composite index
CREATE INDEX CONCURRENTLY idx_users_status_created_at 
ON users(status, created_at DESC) 
WHERE status = 'active';

-- Analyze table
ANALYZE users;

Result: Query time reduced to 120ms (7x improvement)

Covering Index

-- Include frequently accessed columns
CREATE INDEX CONCURRENTLY idx_users_covering
ON users(status, created_at DESC)
INCLUDE (email, name, updated_at)
WHERE status = 'active';

Result: Query time reduced to 45ms (19x improvement)
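
A quick way to confirm the covering index is paying off is to look for an Index Only Scan in the plan. This only applies when the query touches nothing but the indexed and INCLUDEd columns and the visibility map is reasonably up to date; a hedged sketch against the index above:

-- Should show "Index Only Scan using idx_users_covering" when every referenced
-- column lives in the index and most heap pages are all-visible
EXPLAIN (ANALYZE, BUFFERS)
SELECT email, name, updated_at
FROM users
WHERE status = 'active'
ORDER BY created_at DESC
LIMIT 20;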

Partial Index for Common Filters

-- Index only active users
CREATE INDEX CONCURRENTLY idx_users_active
ON users(created_at DESC)
WHERE status = 'active';

-- Index for recent users
-- (a partial index predicate must be immutable, so CURRENT_DATE is not allowed;
--  use a literal cutoff date and recreate the index periodically)
CREATE INDEX CONCURRENTLY idx_users_recent
ON users(id, status)
WHERE created_at > '2022-12-01';
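
Part of the appeal of partial indexes is that they stay much smaller than full-table indexes, so they fit in cache. A quick way to compare, assuming the index names created above:

-- Compare on-disk size of the indexes on users
SELECT indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'users'
ORDER BY pg_relation_size(indexrelid) DESC;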

Step 4: Optimize Queries

Before Optimization

-- Inefficient: Fetches all columns
SELECT * FROM orders 
WHERE user_id IN (
    SELECT id FROM users WHERE status = 'active'
);

Execution time: 650ms

After Optimization

-- Efficient: Select only needed columns, use JOIN
SELECT o.id, o.total, o.created_at
FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE u.status = 'active'
AND o.created_at > CURRENT_DATE - INTERVAL '90 days';

Execution time: 35ms (18x improvement)

Use CTEs Wisely

-- Materialize the expensive subquery once (the MATERIALIZED keyword requires PostgreSQL 12+;
-- before 12, CTEs are always materialized)
WITH active_users AS MATERIALIZED (
    SELECT id, email 
    FROM users 
    WHERE status = 'active'
    AND last_login > CURRENT_DATE - INTERVAL '7 days'
)
SELECT 
    au.email,
    COUNT(o.id) as order_count,
    SUM(o.total) as total_spent
FROM active_users au
LEFT JOIN orders o ON au.id = o.user_id
GROUP BY au.email;

Step 5: Configuration Tuning

Memory Settings

# postgresql.conf

# Shared buffers (25% of RAM)
shared_buffers = 8GB

# Effective cache size (50-75% of RAM)
effective_cache_size = 24GB

# Work mem (for sorting/hashing)
work_mem = 64MB

# Maintenance work mem (for VACUUM, CREATE INDEX)
maintenance_work_mem = 2GB
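
After editing postgresql.conf and reloading (shared_buffers needs a full restart), it is worth confirming the running server actually picked the values up:

-- Check the live values; setting is expressed in the listed unit
SELECT name, setting, unit, source
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem', 'maintenance_work_mem');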

Connection Settings

# Max connections
max_connections = 200

# Connection pooling with PgBouncer
# pgbouncer.ini
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25

Checkpoint Settings

# Reduce checkpoint frequency
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
max_wal_size = 4GB
min_wal_size = 1GB

Query Planner Settings

# Random page cost (lower for SSD)
random_page_cost = 1.1

# Effective IO concurrency
effective_io_concurrency = 200

# Parallel query settings
max_parallel_workers_per_gather = 4
max_parallel_workers = 8

Step 6: Partitioning Large Tables

Time-Based Partitioning

-- Create partitioned table
CREATE TABLE orders_partitioned (
    id BIGSERIAL,
    user_id BIGINT NOT NULL,
    total DECIMAL(10,2),
    created_at TIMESTAMP NOT NULL,
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE orders_2022_q1 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2022-01-01') TO ('2022-04-01');

CREATE TABLE orders_2022_q2 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2022-04-01') TO ('2022-07-01');

CREATE TABLE orders_2022_q3 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2022-07-01') TO ('2022-10-01');

CREATE TABLE orders_2022_q4 PARTITION OF orders_partitioned
    FOR VALUES FROM ('2022-10-01') TO ('2023-01-01');

-- Create indexes on each partition
-- (on PostgreSQL 11+ you can instead create the index once on the parent table
--  and it propagates to every partition)
CREATE INDEX ON orders_2022_q1(user_id);
CREATE INDEX ON orders_2022_q2(user_id);
CREATE INDEX ON orders_2022_q3(user_id);
CREATE INDEX ON orders_2022_q4(user_id);

Result: Queries on recent data 10x faster
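
To confirm the planner is pruning partitions rather than scanning all of them, EXPLAIN a query with a range predicate on the partition key; only the matching partition should appear in the plan. A sketch using the partitions above:

-- The plan should reference only orders_2022_q4
EXPLAIN
SELECT count(*)
FROM orders_partitioned
WHERE created_at >= '2022-10-01';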

Step 7: Vacuum and Analyze

Regular Maintenance

-- Manual vacuum
VACUUM ANALYZE users;

-- Aggressive vacuum for bloated tables
-- (VACUUM FULL rewrites the table and holds an ACCESS EXCLUSIVE lock; run it in a maintenance window)
VACUUM FULL ANALYZE orders;

Autovacuum Tuning

# postgresql.conf
autovacuum = on
autovacuum_max_workers = 4
autovacuum_naptime = 30s

-- Per-table settings for high-traffic tables (run as SQL, not in postgresql.conf)
ALTER TABLE users SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);
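
To verify autovacuum is keeping up, pg_stat_user_tables records when each table was last vacuumed and analyzed:

-- When did autovacuum last visit the busiest tables?
SELECT relname, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;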

Step 8: Connection Pooling

PgBouncer Configuration

# pgbouncer.ini
[databases]
mydb = host=localhost port=5432 dbname=production

[pgbouncer]
listen_addr = *
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3

server_idle_timeout = 600
server_lifetime = 3600

Result: Reduced connection overhead by 60%
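
PgBouncer also exposes pool statistics through its admin console (connect with psql on the listen port against the special pgbouncer database, using a user listed in admin_users). SHOW POOLS and SHOW STATS are handy for sizing default_pool_size:

-- Run against the admin console, e.g. psql -p 6432 -U <admin_user> pgbouncer
SHOW POOLS;
SHOW STATS;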

Step 9: Monitoring

Key Metrics to Track

-- Cache hit ratio (should be > 99%)
SELECT 
    sum(heap_blks_read) as heap_read,
    sum(heap_blks_hit) as heap_hit,
    sum(heap_blks_hit)::float / NULLIF(sum(heap_blks_hit) + sum(heap_blks_read), 0) as ratio
FROM pg_statio_user_tables;

-- Index usage (low idx_scan suggests an index that may not be worth its write cost)
SELECT 
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;

-- Table bloat
SELECT 
    schemaname,
    relname,
    pg_size_pretty(pg_total_relation_size(relid)) as size,
    n_dead_tup,
    n_live_tup,
    round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) as dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

Prometheus Exporter

# docker-compose.yml
services:
  postgres-exporter:
    image: prometheuscommunity/postgres-exporter
    environment:
      DATA_SOURCE_NAME: "postgresql://user:password@postgres:5432/mydb?sslmode=disable"
    ports:
      - "9187:9187"

Results

Performance Improvements

Metric          | Before | After  | Improvement
Avg Query Time  | 500ms  | 20ms   | 25x
P95 Query Time  | 2000ms | 80ms   | 25x
P99 Query Time  | 5000ms | 200ms  | 25x
Queries/Second  | 200    | 2000   | 10x
CPU Usage       | 80%    | 25%    | 3.2x
Cache Hit Ratio | 85%    | 99.5%  | +17%

Cost Savings

  • Before: 4x large instances ($800/month)
  • After: 2x medium instances ($300/month)
  • Savings: $500/month (62.5%)

Best Practices Learned

1. Index Strategy

-- DO: Create indexes for WHERE, JOIN, ORDER BY columns
CREATE INDEX idx_users_email ON users(email);

-- DON'T: Over-index (slows down writes)
-- Avoid indexes on columns with low cardinality

2. Query Patterns

-- DO: Select only needed columns
SELECT id, name, email FROM users;

-- DON'T: Use SELECT *
SELECT * FROM users;

3. Use EXPLAIN

-- Always analyze before optimizing
EXPLAIN (ANALYZE, BUFFERS) 
SELECT ...;

4. Regular Maintenance

# Weekly vacuum
0 2 * * 0 psql -c "VACUUM ANALYZE;"

# Monthly full vacuum on bloated tables
0 3 1 * * psql -c "VACUUM FULL ANALYZE orders;"

Common Pitfalls

1. Too Many Indexes

More indexes = slower writes. Balance read vs write performance.
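
One way to keep the index count honest is to look for indexes that are never scanned; anything with idx_scan = 0 since the last statistics reset is a candidate to drop, as long as it does not back a PRIMARY KEY or UNIQUE constraint:

-- Indexes never used since statistics were last reset
SELECT indexrelid::regclass AS index_name,
       relid::regclass AS table_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;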

2. Not Using EXPLAIN

Don’t guess—measure with EXPLAIN ANALYZE.

3. Ignoring Autovacuum

Bloated tables kill performance. Monitor and tune autovacuum.

4. Wrong Data Types

-- BAD: storing structured values as text
created_at TEXT,
total TEXT

-- GOOD: use types the planner and indexes can work with
-- (in PostgreSQL, TEXT vs VARCHAR(n) itself makes no performance difference)
created_at TIMESTAMPTZ,
total DECIMAL(10,2)

5. N+1 Queries

# BAD: N+1 queries (one extra query per user)
users = User.query.all()
for user in users:
    orders = Order.query.filter_by(user_id=user.id).all()

# GOOD: single query with an eager-loaded JOIN (SQLAlchemy)
from sqlalchemy.orm import joinedload

users = User.query.options(joinedload(User.orders)).all()

Conclusion

PostgreSQL performance tuning is an iterative process:

  1. Measure: Identify slow queries
  2. Analyze: Use EXPLAIN to understand why
  3. Optimize: Add indexes, rewrite queries
  4. Configure: Tune PostgreSQL settings
  5. Monitor: Track metrics continuously

Key Takeaways:

  • Proper indexing is crucial
  • EXPLAIN is your best friend
  • Configuration matters
  • Regular maintenance is essential
  • Monitor everything

Final Results: 25x faster queries, 62.5% cost reduction, happier users.