Building Microservices with Python and Flask
We’ve been running a monolithic Django application for two years, and it’s starting to show its age. The codebase is over 100k lines, deployments take 30+ minutes, and any bug can bring down the entire system.
I spent the last month building our first microservice using Flask. Here’s what I learned about breaking up a monolith.
Why Flask Over Django?
Our Django monolith does everything - user management, payments, notifications, reporting. It’s tightly coupled and hard to change.
For microservices, I chose Flask over Django because:
- Lightweight - Flask is minimal, no batteries included
- Flexible - No ORM or admin forced on you
- Fast startup - Perfect for small, focused services
- Easy to understand - The entire framework is ~7k lines
Django is great for monoliths, but overkill for a service that just sends emails.
The First Service: Email Notifications
I started with the simplest service - email notifications. In the monolith, this was scattered across 20+ files. Perfect candidate for extraction.
Basic Flask app structure:
# app.py
from flask import Flask, request, jsonify
import smtplib
from email.mime.text import MIMEText

app = Flask(__name__)


@app.route('/health')
def health():
    return jsonify({'status': 'healthy'})


@app.route('/send', methods=['POST'])
def send_email():
    data = request.get_json()

    # Validate input
    if not data or 'to' not in data or 'subject' not in data:
        return jsonify({'error': 'Missing required fields'}), 400

    try:
        send_smtp_email(
            to=data['to'],
            subject=data['subject'],
            body=data.get('body', '')
        )
        return jsonify({'status': 'sent'}), 200
    except Exception as e:
        return jsonify({'error': str(e)}), 500


def send_smtp_email(to, subject, body):
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = 'noreply@example.com'
    msg['To'] = to

    with smtplib.SMTP('localhost', 25) as server:
        server.send_message(msg)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
This is about 30 lines. The equivalent Django code was spread across models, views, tasks, and templates.
Configuration Management
I use environment variables for configuration:
# config.py
import os


class Config:
    SMTP_HOST = os.getenv('SMTP_HOST', 'localhost')
    SMTP_PORT = int(os.getenv('SMTP_PORT', 25))
    SMTP_USER = os.getenv('SMTP_USER', '')
    SMTP_PASSWORD = os.getenv('SMTP_PASSWORD', '')
    FROM_EMAIL = os.getenv('FROM_EMAIL', 'noreply@example.com')


# app.py
from config import Config

app.config.from_object(Config)
This makes it easy to change settings per environment without code changes.
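The `send_smtp_email` function shown earlier still hardcodes `localhost`, port 25, and the sender address. Here is a minimal sketch of how it might read those values from `Config` instead. The `build_message` helper and the optional `starttls`/`login` step are assumptions for illustration, not part of the original service.

```python
import os
import smtplib
from email.mime.text import MIMEText


class Config:
    SMTP_HOST = os.getenv('SMTP_HOST', 'localhost')
    SMTP_PORT = int(os.getenv('SMTP_PORT', '25'))
    SMTP_USER = os.getenv('SMTP_USER', '')
    SMTP_PASSWORD = os.getenv('SMTP_PASSWORD', '')
    FROM_EMAIL = os.getenv('FROM_EMAIL', 'noreply@example.com')


def build_message(to, subject, body, config=Config):
    """Hypothetical helper: build the message with the configured sender."""
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = config.FROM_EMAIL
    msg['To'] = to
    return msg


def send_smtp_email(to, subject, body, config=Config):
    msg = build_message(to, subject, body, config)
    with smtplib.SMTP(config.SMTP_HOST, config.SMTP_PORT) as server:
        # Assumption: authenticate only when credentials are configured.
        if config.SMTP_USER:
            server.starttls()
            server.login(config.SMTP_USER, config.SMTP_PASSWORD)
        server.send_message(msg)
```

Passing the config object as a parameter keeps the function easy to test with a different sender or host.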
Adding a Database
For storing email history, I used SQLAlchemy through the Flask-SQLAlchemy extension instead of the Django ORM:
from flask_sqlalchemy import SQLAlchemy

app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://localhost/emails'
db = SQLAlchemy(app)


class EmailLog(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    to = db.Column(db.String(255), nullable=False)
    subject = db.Column(db.String(255), nullable=False)
    body = db.Column(db.Text)
    sent_at = db.Column(db.DateTime, default=db.func.now())
    status = db.Column(db.String(50))

    def to_dict(self):
        return {
            'id': self.id,
            'to': self.to,
            'subject': self.subject,
            'sent_at': self.sent_at.isoformat(),
            'status': self.status
        }


@app.route('/send', methods=['POST'])
def send_email():
    data = request.get_json()

    # Validate input before touching the database
    if not data or 'to' not in data or 'subject' not in data:
        return jsonify({'error': 'Missing required fields'}), 400

    # Create log entry
    log = EmailLog(
        to=data['to'],
        subject=data['subject'],
        body=data.get('body', ''),
        status='pending'
    )
    db.session.add(log)
    db.session.commit()

    try:
        send_smtp_email(data['to'], data['subject'], data.get('body', ''))
        log.status = 'sent'
    except Exception as e:
        # Record the failure before returning the error
        log.status = 'failed'
        db.session.commit()
        return jsonify({'error': str(e)}), 500

    db.session.commit()
    return jsonify(log.to_dict()), 200
SQLAlchemy is more verbose than Django ORM, but it’s also more explicit. I like that.
Service Communication
The monolith calls this service via HTTP:
# In Django monolith
import requests


def send_notification(user_email, subject, body):
    # A timeout keeps a hung email service from blocking the caller forever
    response = requests.post('http://email-service:5000/send', json={
        'to': user_email,
        'subject': subject,
        'body': body
    }, timeout=5)
    return response.json()
Simple, but synchronous. If the email service is down, the request fails. I’ll need to add a message queue later.
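Until a queue is in place, one stopgap is to retry the call a few times with a growing delay. The sketch below is not what the monolith currently does; `call_with_retries` and `flaky_send` are hypothetical names, and the HTTP call is replaced by a stand-in function so the example runs without a network.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on an exception, wait and retry with exponential backoff.

    Re-raises the last exception if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))


# Stand-in for the HTTP call: fails twice, then succeeds.
calls = {'count': 0}


def flaky_send():
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('email service unavailable')
    return {'status': 'sent'}


# The sleep function is swapped out here so the example does not wait.
result = call_with_retries(flaky_send, sleep=lambda seconds: None)
```

In the monolith, `func` would wrap the `requests.post` call. Retrying is only safe if a repeated request cannot send the same email twice, so this is probably best limited to connection errors.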
Error Handling
Flask’s default error handling is minimal. I added custom handlers:
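from werkzeug.exceptions import HTTPException


@app.errorhandler(404)
def not_found(error):
    return jsonify({'error': 'Not found'}), 404


@app.errorhandler(500)
def internal_error(error):
    db.session.rollback()
    return jsonify({'error': 'Internal server error'}), 500


@app.errorhandler(Exception)
def handle_exception(e):
    # Let HTTP errors such as 405 keep their own status code
    # instead of turning them into a 500.
    if isinstance(e, HTTPException):
        return jsonify({'error': e.description}), e.code
    app.logger.error('Unhandled exception: %s', e)
    return jsonify({'error': 'Internal server error'}), 500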
This ensures all errors return JSON, not HTML.
Logging
I set up file logging with rotation:
import logging
from logging.handlers import RotatingFileHandler

if not app.debug:
    # The logs/ directory must already exist.
    # Rotate at roughly 10 MB and keep 10 old files.
    file_handler = RotatingFileHandler('logs/email-service.log',
                                       maxBytes=10240000,
                                       backupCount=10)
    file_handler.setFormatter(logging.Formatter(
        '%(asctime)s %(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]'
    ))
    file_handler.setLevel(logging.INFO)
    app.logger.addHandler(file_handler)
    app.logger.setLevel(logging.INFO)
    app.logger.info('Email service startup')
Now I can track what’s happening in production.
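The format string above produces plain-text lines. If a log aggregator expects one JSON object per line, a formatter along these lines might work. This is a sketch using only the standard library; `JsonFormatter` and the field names are my own choices, not something the service uses today.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            'time': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'logger': record.name,
            'path': record.pathname,
            'line': record.lineno,
        }
        if record.exc_info:
            payload['exception'] = self.formatException(record.exc_info)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())

logger = logging.getLogger('email-service')
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info('Email service startup')
```

The same formatter can be passed to the `RotatingFileHandler` in place of the plain `logging.Formatter`.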
Testing
Flask makes testing easy:
# test_app.py
import unittest
from app import app, db


class EmailServiceTestCase(unittest.TestCase):
    def setUp(self):
        app.config['TESTING'] = True
        app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///:memory:'
        self.client = app.test_client()
        with app.app_context():
            db.create_all()

    def tearDown(self):
        with app.app_context():
            db.session.remove()
            db.drop_all()

    def test_health_check(self):
        response = self.client.get('/health')
        self.assertEqual(response.status_code, 200)
        self.assertEqual(response.json['status'], 'healthy')

    def test_send_email_missing_fields(self):
        response = self.client.post('/send', json={})
        self.assertEqual(response.status_code, 400)

    def test_send_email_success(self):
        response = self.client.post('/send', json={
            'to': 'test@example.com',
            'subject': 'Test',
            'body': 'Test body'
        })
        self.assertEqual(response.status_code, 200)


if __name__ == '__main__':
    unittest.main()
Run with python test_app.py. Much simpler than Django’s test framework.
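One caveat: `test_send_email_success` as written opens a real SMTP connection to localhost, so it fails on a machine without a mail server. Here is a sketch of how the sender could be tested in isolation by patching `smtplib.SMTP`. It assumes Python 3, where `unittest.mock` is in the standard library (on 2.7 the separate `mock` package provides the same API), and it repeats `send_smtp_email` so the example is standalone.

```python
import smtplib
import unittest
from email.mime.text import MIMEText
from unittest import mock


def send_smtp_email(to, subject, body):
    # Same shape as the service's sender, repeated to keep this standalone.
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = 'noreply@example.com'
    msg['To'] = to
    with smtplib.SMTP('localhost', 25) as server:
        server.send_message(msg)


class SendSmtpEmailTestCase(unittest.TestCase):
    def test_sends_one_message(self):
        # Replace smtplib.SMTP so no real connection is opened.
        with mock.patch('smtplib.SMTP') as smtp_class:
            send_smtp_email('test@example.com', 'Test', 'Test body')

        smtp_class.assert_called_once_with('localhost', 25)
        # The object bound by "with ... as server" is __enter__'s return value.
        server = smtp_class.return_value.__enter__.return_value
        sent = server.send_message.call_args[0][0]
        self.assertEqual(sent['To'], 'test@example.com')
        self.assertEqual(sent['Subject'], 'Test')
```

In the Flask tests, the same idea would apply by patching `app.send_smtp_email` before posting to `/send`.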
Deployment
I deploy with Gunicorn behind nginx:
# Install Gunicorn
pip install gunicorn
# Run with 4 workers
gunicorn -w 4 -b 0.0.0.0:5000 app:app
Nginx config:
upstream email_service {
    server 127.0.0.1:5000;
}

server {
    listen 80;
    server_name email-service.internal;

    location / {
        proxy_pass http://email_service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
For process management, I use Supervisor:
[program:email-service]
command=/usr/local/bin/gunicorn -w 4 -b 127.0.0.1:5000 app:app
directory=/opt/email-service
user=www-data
autostart=true
autorestart=true
stderr_logfile=/var/log/email-service/err.log
stdout_logfile=/var/log/email-service/out.log
Monitoring
I added Prometheus metrics:
from flask import Response
from prometheus_client import (
    CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest
)

emails_sent = Counter('emails_sent_total', 'Total emails sent')
emails_failed = Counter('emails_failed_total', 'Total emails failed')
request_duration = Histogram('request_duration_seconds', 'Request duration')


@app.route('/metrics')
def metrics():
    # Serve the metrics in the exposition format Prometheus expects
    return Response(generate_latest(), content_type=CONTENT_TYPE_LATEST)


@app.route('/send', methods=['POST'])
def send_email():
    # Time the whole request, including validation and logging
    with request_duration.time():
        # ... existing code ...
        try:
            send_smtp_email(...)
            emails_sent.inc()
        except Exception as e:
            emails_failed.inc()
            raise
Now Prometheus can scrape metrics and alert on failures.
Challenges and Lessons
What worked well:
- Fast development - Built the service in 3 days
- Easy to understand - New team members get it immediately
- Independent deployment - No more 30-minute deploys
- Focused responsibility - Does one thing well
What was hard:
- Service discovery - Hardcoded URLs are fragile
- Distributed debugging - Tracing requests across services is harder
- Data consistency - No more database transactions across features
- Operational overhead - Now managing multiple services
What I’d do differently:
- Add a message queue - RabbitMQ or Redis for async communication
- Use service discovery - Consul or etcd instead of hardcoded URLs
- Implement circuit breakers - Prevent cascading failures
- Add distributed tracing - Zipkin or Jaeger for request tracking
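To make the circuit breaker idea concrete, here is a minimal sketch using only the standard library. I have not built this into the service; `CircuitBreaker`, `CircuitOpenError`, and the thresholds are hypothetical, and a production version would also need to be thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError('circuit open, call skipped')
            # Timeout elapsed: allow one trial call (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                # Open (or re-open) the circuit and restart the timer.
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

The monolith would wrap its call to the email service in `breaker.call(...)`, so that after repeated failures it stops waiting on a service that is known to be down and fails fast instead.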
Python 2 vs Python 3
I’m still using Python 2.7 in production (Python 3.5 just came out). The main differences I care about:
# Python 2
print "Hello"
1 / 2 # Returns 0 (integer division)
# Python 3
print("Hello")
1 / 2 # Returns 0.5 (float division)
For new services, I’m starting to use Python 3.5. The async/await syntax looks promising, but I haven’t tried it yet.
Comparison with Go
I’ve been learning Go on the side. For microservices, Go has advantages:
- Single binary - No virtualenv or dependencies
- Fast startup - Milliseconds vs seconds
- Low memory - 10MB vs 50MB+ for Python
- Built-in concurrency - Goroutines vs threading
But Python is faster to write and has better libraries. For now, I’m sticking with Python for services that aren’t performance-critical.
Conclusion
Flask is a very good fit for small, focused microservices. It’s minimal, flexible, and gets out of your way.
Breaking up the monolith is working well. The email service is simpler, easier to test, and can be deployed independently. We’re planning to extract 3-4 more services this year.
Key takeaways:
- Start with the simplest service first
- Keep services small and focused
- Use environment variables for configuration
- Add monitoring from day one
- Plan for failure (circuit breakers, retries)
Microservices aren’t a silver bullet - they add operational complexity. But for our growing team, the benefits outweigh the costs.
Next up: extracting the payment service. That one will be more challenging.