Redis Persistence: RDB vs AOF in Production
We lost all our Redis data when the server crashed. Session data, cache, everything gone. Users were logged out, and pages were slow while the cache rebuilt.
I learned about Redis persistence the hard way. Now we use a hybrid AOF + RDB approach. Last crash? Zero data loss.
The Crash
Friday, 4 PM: Server power failure
Friday, 4:15 PM: Server back online
Friday, 4:16 PM: Redis starts with empty dataset
Friday, 4:17 PM: All users logged out, cache cold
We were running Redis without any persistence enabled. All data lived in memory, with nothing on disk.
Redis Persistence Options
Redis offers two persistence methods:
- RDB (Redis Database) - Point-in-time snapshots
- AOF (Append Only File) - Log of every write operation
RDB Snapshots
RDB saves the whole dataset to disk at intervals.
Default config (redis.conf):
# redis.conf does not allow trailing comments, so each note sits on its own line
# Save after 900 seconds if at least 1 key changed
save 900 1
# Save after 300 seconds if at least 10 keys changed
save 300 10
# Save after 60 seconds if at least 10000 keys changed
save 60 10000
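To make the way these rules combine concrete, here is a minimal sketch in shell. It is a simplification of what Redis does internally, and `should_snapshot` and `RULES` are hypothetical names for illustration, not Redis commands or settings. A snapshot is due as soon as any single rule has both its time window and its change count satisfied.

```shell
#!/bin/sh
# Sketch of how "save <seconds> <changes>" rules combine: a snapshot is due
# when ANY rule has both its time window elapsed and its change count reached.
# should_snapshot and RULES are hypothetical helpers, not part of Redis.

RULES="900:1 300:10 60:10000"   # same thresholds as the save lines above

should_snapshot() {
    elapsed=$1   # seconds since the last successful snapshot
    changes=$2   # keys changed since the last successful snapshot
    for rule in $RULES; do
        secs=${rule%%:*}
        min_changes=${rule##*:}
        if [ "$elapsed" -ge "$secs" ] && [ "$changes" -ge "$min_changes" ]; then
            echo yes
            return 0
        fi
    done
    echo no
    return 0
}

should_snapshot 120 5000   # no: too few changes for the 60 s rule, too early for the others
should_snapshot 300 10     # yes: the 300 s / 10 changes rule matches
```

The practical consequence is that a quiet instance can go up to 15 minutes between snapshots, which is the window of data you can lose with RDB alone.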
Manual snapshot:
redis-cli BGSAVE
This writes a dump.rdb file in the background.
RDB Pros
- Compact - Single file, easy to backup
- Fast recovery - Loading RDB is faster than replaying AOF
- Good for backups - Copy dump.rdb to backup location
- Minimal performance impact - Fork process, parent continues serving
RDB Cons
- Data loss - Can lose data between snapshots
- Fork can be slow - On large datasets, fork takes time
- Not real-time - Minutes of data loss possible
AOF (Append Only File)
AOF logs every write operation.
Enable in redis.conf:
appendonly yes
appendfilename "appendonly.aof"
AOF Sync Policies
Three options:
1. appendfsync always - Sync after every write
appendfsync always
- Safest (no data loss)
- Slowest (disk I/O for every write)
2. appendfsync everysec - Sync every second (default)
appendfsync everysec
- Good balance
- Can lose 1 second of data
- Recommended for most cases
3. appendfsync no - Let OS decide when to sync
appendfsync no
- Fastest
- Can lose more data
- Not recommended
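A rough way to compare the three policies is to estimate how many acknowledged writes could be lost in a crash. The sketch below is only back-of-the-envelope arithmetic: `writes_at_risk` is a hypothetical helper, the 10,000 writes/sec rate is an invented example, and the 30 second figure for `appendfsync no` is an assumption about a typical Linux flush interval that depends on kernel tuning.

```shell
#!/bin/sh
# writes_at_risk POLICY RATE [OS_FLUSH_SECS]
# Estimate the number of writes exposed to loss for a given appendfsync policy.
# Hypothetical helper for illustration; the numbers are rough upper bounds.

writes_at_risk() {
    policy=$1
    rate=$2                  # writes per second (example workload figure)
    os_flush_secs=${3:-30}   # assumption: the OS flushes dirty pages about every 30 s
    case "$policy" in
        always)   echo 0 ;;                           # synced before the write is acknowledged
        everysec) echo "$rate" ;;                     # about one second of writes
        no)       echo $(( rate * os_flush_secs )) ;; # whatever the OS has not flushed yet
        *)        echo "unknown policy: $policy" >&2 ;;
    esac
}

writes_at_risk always 10000     # prints 0
writes_at_risk everysec 10000   # prints 10000
writes_at_risk no 10000         # prints 300000
```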
AOF Rewrite
The AOF file grows over time, because it records every write. Redis can rewrite it into a compact form:
redis-cli BGREWRITEAOF
Or automatic:
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
With these settings, Redis rewrites the AOF once it is at least 64MB and has grown by 100% (doubled) since the last rewrite.
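The trigger can be sketched in a few lines of shell. This is an approximation of the rule, not Redis source code: `needs_rewrite` is a hypothetical helper, and its two size arguments correspond to the `aof_current_size` and `aof_base_size` fields reported by `INFO persistence`.

```shell
#!/bin/sh
# needs_rewrite CURRENT_BYTES BASE_BYTES [PERCENTAGE] [MIN_BYTES]
# Approximates the automatic rewrite rule: the file must be at least MIN_BYTES
# and must have grown by at least PERCENTAGE since the last rewrite.
# Hypothetical helper for illustration only.

needs_rewrite() {
    current=$1
    base=$2
    pct=${3:-100}              # auto-aof-rewrite-percentage
    min_size=${4:-67108864}    # auto-aof-rewrite-min-size (64 MB in bytes)
    [ "$base" -gt 0 ] || base=1            # avoid dividing by zero on a fresh AOF
    growth=$(( current * 100 / base - 100 ))
    if [ "$current" -ge "$min_size" ] && [ "$growth" -ge "$pct" ]; then
        echo rewrite
    else
        echo skip
    fi
}

needs_rewrite 1024 512               # skip: doubled, but far below 64 MB
needs_rewrite 134217728 67108864     # rewrite: 128 MB, 100% above a 64 MB base
needs_rewrite 100663296 67108864     # skip: 96 MB is only 50% growth
```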
AOF Pros
- Durable - Can lose at most 1 second of data
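- Append-only - No seeks or in-place overwrites; a crash can at worst leave a truncated last command, which redis-check-aof can repair
- Readable - AOF is a plain log of commands in the Redis protocol format, so it can be inspected and edited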
- Automatic rewrite - Keeps file size manageable
AOF Cons
- Larger files - AOF bigger than RDB
- Slower recovery - Replaying AOF takes longer
- Slightly slower - More disk I/O than RDB
RDB + AOF Hybrid
Best of both worlds:
# Enable both
save 900 1
save 300 10
save 60 10000
appendonly yes
appendfsync everysec
On restart:
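- Redis loads the AOF, because it is usually the more complete record
- Don’t count on a fallback to RDB if the AOF doesn’t exist: older Redis versions started with appendonly yes and no AOF file can come up empty. When converting an RDB-only instance, run redis-cli CONFIG SET appendonly yes on the live server first, then update redis.conf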
Our Configuration
Production redis.conf:
# RDB snapshots
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
# AOF
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
# Memory
maxmemory 2gb
maxmemory-policy allkeys-lru
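Since a config like this can drift between servers, a small check before deploying may help. The following is a minimal sketch, assuming a plain redis.conf with one directive per line; `conf_value` and `check_persistence` are hypothetical helper names, and the checks only cover the settings this article relies on.

```shell
#!/bin/sh
# conf_value FILE DIRECTIVE: print the last value set for DIRECTIVE.
# Comment lines never match because their first field starts with "#".
conf_value() {
    awk -v key="$2" '
        $1 == key { $1 = ""; sub(/^ +/, ""); val = $0 }
        END { print val }
    ' "$1"
}

# check_persistence FILE: print one line per problem, or a single OK line.
# Returns non-zero when a problem was found.
check_persistence() {
    conf=$1
    problems=0
    if [ "$(conf_value "$conf" appendonly)" != "yes" ]; then
        echo "appendonly is not set to yes"
        problems=$((problems + 1))
    fi
    if [ "$(conf_value "$conf" appendfsync)" != "everysec" ]; then
        echo "appendfsync is not set to everysec"
        problems=$((problems + 1))
    fi
    if [ "$(conf_value "$conf" save)" = '""' ]; then
        echo 'save "" disables RDB snapshots'
        problems=$((problems + 1))
    fi
    if [ "$problems" -eq 0 ]; then
        echo "persistence config looks OK"
    fi
    [ "$problems" -eq 0 ]
}
```

A typical call would be `check_persistence /etc/redis/redis.conf`, for example as a step in a deploy script that aborts when the function returns non-zero.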
Testing Persistence
Simulate crash:
# Write some data
redis-cli SET test "hello"
redis-cli SET user:1 "john"
# Kill Redis (simulate crash)
kill -9 $(pgrep redis-server)
# Start Redis
redis-server /etc/redis/redis.conf
# Check data
redis-cli GET test
# Output: "hello"
Data survived!
Backup Strategy
Daily backups:
#!/bin/bash
# backup-redis.sh
DATE=$(date +%Y%m%d)
BACKUP_DIR=/backups/redis
mkdir -p "$BACKUP_DIR"
# Record the last successful save time so we can detect when the new one finishes
LAST_SAVE=$(redis-cli LASTSAVE)
# Trigger RDB snapshot
redis-cli BGSAVE
# Wait for snapshot to complete (LASTSAVE changes once BGSAVE succeeds)
while [ "$(redis-cli LASTSAVE)" -eq "$LAST_SAVE" ]; do
    sleep 1
done
# Copy RDB file
cp /var/lib/redis/dump.rdb "$BACKUP_DIR/dump-$DATE.rdb"
# Copy AOF file
cp /var/lib/redis/appendonly.aof "$BACKUP_DIR/appendonly-$DATE.aof"
# Keep last 7 days
find "$BACKUP_DIR" -name "dump-*.rdb" -mtime +7 -delete
find "$BACKUP_DIR" -name "appendonly-*.aof" -mtime +7 -delete
Cron job:
0 2 * * * /usr/local/bin/backup-redis.sh
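A backup that was never checked is easy to over-trust. As a cheap sanity check, the sketch below verifies that a copied snapshot is non-empty and starts with the `REDIS` magic string that RDB files begin with. `check_rdb_backup` is a hypothetical helper; it does not validate the full file, which a tool like redis-check-rdb would.

```shell
#!/bin/sh
# check_rdb_backup FILE
# Prints "ok", "missing or empty", or "not an RDB file".
# Only inspects the first 5 bytes, so it catches empty or wrong files,
# not corruption deeper in the snapshot. Hypothetical helper.

check_rdb_backup() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "missing or empty"
        return 0
    fi
    magic=$(head -c 5 "$file")
    if [ "$magic" = "REDIS" ]; then
        echo "ok"
    else
        echo "not an RDB file"
    fi
}
```

In the backup script this could run right after the cp, for example `check_rdb_backup "$BACKUP_DIR/dump-$DATE.rdb"`, with the script alerting on anything other than `ok`.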
Monitoring Persistence
Check last save time:
redis-cli LASTSAVE
Check if save is in progress:
redis-cli INFO persistence
Output:
# Persistence
loading:0
rdb_changes_since_last_save:42
rdb_bgsave_in_progress:0
rdb_last_save_time:1481385600
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:1
aof_enabled:1
aof_rewrite_in_progress:0
aof_last_rewrite_time_sec:-1
aof_current_size:1024
aof_base_size:512
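Reading this output by eye does not scale, so here is a minimal sketch of turning it into alerts. `info_field` and `persistence_alerts` are hypothetical helpers, and the one hour threshold in the usage note is an arbitrary example. Note that redis-cli emits lines ending in carriage returns, which the sketch strips before comparing values.

```shell
#!/bin/sh
# info_field NAME: read INFO output on stdin and print the value of NAME.
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# persistence_alerts NOW MAX_AGE_SECS: read "INFO persistence" output on stdin
# and print one line per problem. Prints nothing when everything looks healthy.
persistence_alerts() {
    now=$1
    max_age=$2
    info=$(cat)
    status=$(printf '%s\n' "$info" | info_field rdb_last_bgsave_status)
    last_save=$(printf '%s\n' "$info" | info_field rdb_last_save_time)
    aof=$(printf '%s\n' "$info" | info_field aof_enabled)
    last_save=${last_save:-0}   # treat a missing field as "never saved"
    if [ "$status" != "ok" ]; then
        echo "last BGSAVE failed (status: $status)"
    fi
    if [ "$aof" != "1" ]; then
        echo "AOF is disabled"
    fi
    if [ $(( now - last_save )) -gt "$max_age" ]; then
        echo "last snapshot is older than ${max_age}s"
    fi
    return 0
}
```

Against a live server this could be run as `redis-cli INFO persistence | persistence_alerts "$(date +%s)" 3600`, with any output forwarded to whatever alerting you already use.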
Recovery from Backup
If Redis won’t start:
# Stop Redis
systemctl stop redis
# Restore from backup
cp /backups/redis/dump-20161210.rdb /var/lib/redis/dump.rdb
cp /backups/redis/appendonly-20161210.aof /var/lib/redis/appendonly.aof
# Fix permissions
chown redis:redis /var/lib/redis/*
# Start Redis
systemctl start redis
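The restore above hardcodes a date. Because the backup script names files as dump-YYYYMMDD.rdb and appendonly-YYYYMMDD.aof, the newest one is simply the last name in sorted order. A small sketch of picking it automatically follows; `latest_backup` is a hypothetical helper.

```shell
#!/bin/sh
# latest_backup DIR PREFIX EXT
# Print the newest DIR/PREFIX-YYYYMMDD.EXT, or nothing if none exist.
# Relies on YYYYMMDD names sorting chronologically. Hypothetical helper.

latest_backup() {
    latest=
    for f in "$1"/"$2"-*."$3"; do
        [ -e "$f" ] || continue   # the glob stays literal when nothing matches
        latest=$f                 # globs expand in sorted order, so the last wins
    done
    if [ -n "$latest" ]; then
        printf '%s\n' "$latest"
    fi
}
```

For example, `cp "$(latest_backup /backups/redis dump rdb)" /var/lib/redis/dump.rdb` would restore the most recent snapshot. It is worth checking that the result is non-empty before copying.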
AOF Corruption
If AOF is corrupted:
# Check AOF
redis-check-aof appendonly.aof
# Fix AOF (removes corrupted part)
redis-check-aof --fix appendonly.aof
Performance Impact
Measured on our server (2GB dataset):
| Config | Write Ops/sec | Save Time |
|---|---|---|
| No persistence | 85,000 | N/A |
| RDB only | 82,000 | 2.3s |
| AOF (everysec) | 78,000 | N/A |
| RDB + AOF | 76,000 | 2.5s |
AOF with everysec cost us about 8% of write throughput, and RDB + AOF about 11%. For us that is worth it for durability.
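The overhead of each configuration can be derived directly from the table by comparing it with the no-persistence baseline. A small sketch of that arithmetic, where `overhead_pct` is a hypothetical helper:

```shell
#!/bin/sh
# overhead_pct BASELINE MEASURED
# Print the throughput lost relative to BASELINE, as a percentage with one decimal.
# Hypothetical helper for illustration.

overhead_pct() {
    awk -v base="$1" -v measured="$2" \
        'BEGIN { printf "%.1f\n", (base - measured) * 100 / base }'
}

overhead_pct 85000 82000   # RDB only:       prints 3.5
overhead_pct 85000 78000   # AOF (everysec): prints 8.2
overhead_pct 85000 76000   # RDB + AOF:      prints 10.6
```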
When to Use What
RDB only:
- Cache that can be rebuilt
- Data loss acceptable
- Fast recovery needed
AOF only:
- Critical data
- Can’t afford data loss
- Recovery time not critical
RDB + AOF (recommended):
- Production systems
- Best balance of safety and performance
- Our choice
Lessons Learned
- Always enable persistence - Unless data is truly disposable
- Use AOF for critical data - Can’t afford to lose sessions
- Test recovery - Simulate crashes, verify data survives
- Monitor persistence - Check LASTSAVE, INFO persistence
- Backup regularly - Copy RDB/AOF files offsite
Results
Before:
- No persistence
- Lost all data on crash
- Users logged out
- Cache rebuild took 30 minutes
After:
- RDB + AOF enabled
- No data lost in our last crash (with everysec, up to about a second of writes is still at risk)
- Users stay logged in
- Instant recovery
Conclusion
Redis persistence is essential for production. Don’t learn this lesson the hard way like we did.
Key takeaways:
- Enable persistence (RDB + AOF)
- Use appendfsync everysec for balance
- Backup RDB and AOF files
- Test recovery procedures
- Monitor persistence status
Redis is fast, but speed means nothing if you lose your data. Configure persistence properly.