Go Context Patterns: Lessons from Production
I’ve been writing Go microservices for about two years now, and context.Context is one of those features that seemed simple at first but revealed layers of complexity as I used it in production.
We recently had a production incident where a cascading timeout failure took down our entire service mesh. The root cause? Improper context handling. This post covers the patterns I’ve learned (often the hard way).
What is Context?
If you’re new to Go, context.Context is an interface that carries deadlines, cancellation signals, and request-scoped values across API boundaries and goroutines.
The basic interface:
type Context interface {
Deadline() (deadline time.Time, ok bool)
Done() <-chan struct{}
Err() error
Value(key interface{}) interface{}
}
It joined the standard library in Go 1.7 (2016), having started life as golang.org/x/net/context, and has become the standard way to handle request lifecycles in servers.
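To make the interface a bit more concrete, here is a minimal sketch that walks one context through its lifecycle. The `describe` and `states` helpers are names made up for this illustration, not part of the standard library.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// describe is a hypothetical helper that reports what a context currently
// says about its deadline and cancellation state.
func describe(ctx context.Context) string {
	_, hasDeadline := ctx.Deadline()
	select {
	case <-ctx.Done():
		return fmt.Sprintf("done (deadline=%t, err=%v)", hasDeadline, ctx.Err())
	default:
		return fmt.Sprintf("active (deadline=%t)", hasDeadline)
	}
}

// states records the report for a background context, a live context with a
// deadline, and the same context after cancellation.
func states() []string {
	var out []string
	out = append(out, describe(context.Background()))

	ctx, cancel := context.WithTimeout(context.Background(), time.Hour)
	out = append(out, describe(ctx))

	cancel() // closes Done() and sets Err() to context.Canceled
	out = append(out, describe(ctx))
	return out
}

func main() {
	for _, s := range states() {
		fmt.Println(s)
	}
}
```

The last line shows the detail that matters most in practice: once Done() is closed, Err() tells you why.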
Pattern 1: Always Accept Context as First Parameter
This is the official convention, but I see it violated constantly:
// Bad
func FetchUser(userID string, ctx context.Context) (*User, error) {
// ...
}
// Good
func FetchUser(ctx context.Context, userID string) (*User, error) {
// ...
}
Why does this matter? Consistency. When every function follows the same pattern, code is easier to read and refactor. Our team enforces this with a linter.
Pattern 2: Propagate Context Through the Call Chain
This seems obvious, but it’s easy to forget when you’re deep in a call stack:
// Bad: a fresh context drops the parent's cancellation and deadline
func HandleRequest(ctx context.Context, req *Request) error {
    newCtx := context.Background()
    return processRequest(newCtx, req)
}
// Good: propagate the caller's context
func HandleRequest(ctx context.Context, req *Request) error {
    return processRequest(ctx, req)
}
We had a bug where a background job was using context.Background() instead of the request context. When the HTTP request was cancelled, the background job kept running, wasting resources.
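The difference is easy to see in isolation. The sketch below cancels a parent context and checks two others: one derived from the parent and one started from context.Background(). The function name `observeCancel` is made up for the example, and the explicit cancel call stands in for a client disconnecting.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// observeCancel cancels a parent context and returns the error reported by a
// context derived from it and by an unrelated context.Background().
func observeCancel() (derivedErr, detachedErr error) {
	parent, cancelParent := context.WithCancel(context.Background())

	derived, cancelDerived := context.WithTimeout(parent, time.Hour)
	defer cancelDerived()
	detached := context.Background()

	cancelParent() // stands in for the client disconnecting

	return derived.Err(), detached.Err()
}

func main() {
	derivedErr, detachedErr := observeCancel()
	fmt.Println("derived: ", derivedErr)
	fmt.Println("detached:", detachedErr)
}
```

The derived context reports the cancellation straight away, while the detached one never will, which is how a job keeps running after its request has gone.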
Pattern 3: Set Timeouts at the Right Level
This is where our production incident happened. We had timeouts at every level:
// API Gateway: 30s timeout
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
// Service A: 25s timeout
ctx, cancel = context.WithTimeout(ctx, 25*time.Second)
defer cancel()
// Service B: 20s timeout
ctx, cancel = context.WithTimeout(ctx, 20*time.Second)
defer cancel()
// Database: 15s timeout
ctx, cancel = context.WithTimeout(ctx, 15*time.Second)
defer cancel()
The problem: when Service B hit its 20s timeout, its derived context expired and the call returned a deadline error. Service A's own context was still live, so it treated that error as an ordinary downstream failure and retried. The same thing then repeated up the chain.
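A small sketch of that mechanism, with much shorter durations than ours and a made-up function name (`childExpiry`): the child's deadline fires, the child reports an error, and the parent context is untouched.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// childExpiry gives a child context a shorter timeout than its parent, waits
// for the child to expire, and returns the error each context reports.
func childExpiry() (parentErr, childErr error) {
	parent, cancelParent := context.WithTimeout(context.Background(), time.Minute)
	defer cancelParent()

	child, cancelChild := context.WithTimeout(parent, 10*time.Millisecond)
	defer cancelChild()

	<-child.Done() // block until the child's deadline has passed
	return parent.Err(), child.Err()
}

func main() {
	parentErr, childErr := childExpiry()
	fmt.Println("parent:", parentErr)
	fmt.Println("child: ", childErr)
}
```

From the caller's side there is nothing in its own context to say "stop", only a failed call. If the caller's response to a failed call is to retry, the retries begin exactly when the system is already slow.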
Better approach - set timeout once at the top level:
// API Gateway: 30s timeout for entire request
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
// Services just check if context is done
func ServiceA(ctx context.Context) error {
    select {
    case <-ctx.Done():
        return ctx.Err()
    default:
    }
    // Do work, passing ctx to anything that blocks
    return nil
}
For operations that need their own timeout, use a child context:
func FetchWithRetry(ctx context.Context, url string) (*Response, error) {
    for i := 0; i < 3; i++ {
        // Each attempt gets 5s, but respects parent timeout
        attemptCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
        resp, err := fetch(attemptCtx, url)
        // Cancel now rather than with defer: a defer inside the loop would not
        // run until FetchWithRetry returns. This assumes fetch has finished
        // reading the response by the time it returns.
        cancel()
        if err == nil {
            return resp, nil
        }
        // Stop retrying once the parent context is done
        if ctx.Err() != nil {
            return nil, ctx.Err()
        }
    }
    return nil, errors.New("all retries failed")
}
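"Respects parent timeout" is worth verifying rather than trusting. In this sketch (the function name `cappedByParent` and the durations are illustrative assumptions), the parent has one second left and the attempt asks for five. The attempt ends up with the parent's deadline.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cappedByParent reports whether a 5s attempt timeout, derived from a parent
// with a 1s timeout, ends up with the parent's deadline.
func cappedByParent() bool {
	parent, cancelParent := context.WithTimeout(context.Background(), time.Second)
	defer cancelParent()

	attempt, cancelAttempt := context.WithTimeout(parent, 5*time.Second)
	defer cancelAttempt()

	parentDeadline, _ := parent.Deadline()
	attemptDeadline, _ := attempt.Deadline()
	return attemptDeadline.Equal(parentDeadline)
}

func main() {
	fmt.Println("attempt capped by parent:", cappedByParent())
}
```

So a child timeout can shorten the budget but never extend it, which is what makes per-attempt timeouts safe under a single top-level deadline.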
Pattern 4: Always Defer cancel()
Even if you don’t think you need it:
ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel() // Always call this
result, err := doWork(ctx)
return result, err
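Why? The context holds resources (a timer, a registration with its parent, sometimes a goroutine). They stay alive until cancel runs, the deadline passes, or the parent is cancelled, so with long timeouts and high request rates they pile up. go vet's lostcancel check catches the most obvious omissions. I’ve seen services slowly consume memory because of missing defer cancel() calls.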
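One worry I hear is that a deferred cancel might clash with an explicit one. It doesn't: calling cancel more than once is harmless. A minimal sketch, with a made-up function name (`cancelTwice`):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cancelTwice releases a context explicitly and again via defer, then returns
// the error the context reports after the first cancel.
func cancelTwice() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Hour)
	defer cancel() // safety net for every return path

	cancel() // explicit early release; the deferred call becomes a no-op
	return ctx.Err()
}

func main() {
	fmt.Println(cancelTwice())
}
```

Because the second call does nothing, there is no downside to writing the defer immediately after creating the context and releasing early where it helps.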
Pattern 5: Use Context Values Sparingly
Context values are controversial. The official docs say to use them only for request-scoped data that crosses API boundaries.
Here’s what I use them for:
type contextKey string
const (
requestIDKey contextKey = "request_id"
userIDKey contextKey = "user_id"
)
func WithRequestID(ctx context.Context, id string) context.Context {
return context.WithValue(ctx, requestIDKey, id)
}
func GetRequestID(ctx context.Context) string {
if id, ok := ctx.Value(requestIDKey).(string); ok {
return id
}
return ""
}
I use this for logging:
func LogError(ctx context.Context, msg string, err error) {
log.Printf("[%s] %s: %v", GetRequestID(ctx), msg, err)
}
What I DON’T use context values for:
- Configuration (use explicit parameters)
- Optional parameters (use explicit parameters or options pattern)
- Anything that affects business logic
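The reason for a dedicated key type rather than a bare string is that lookups compare the key's type as well as its value. The sketch below stores a value under contextKey("request_id") and tries three keys that all carry the same string. The `otherKey` type and the `lookups` function are hypothetical, standing in for an unrelated package.

```go
package main

import (
	"context"
	"fmt"
)

type contextKey string // same shape as the key type above

// otherKey stands in for a key type declared by some unrelated package.
type otherKey string

// lookups stores a value under contextKey("request_id") and reports which of
// three same-string keys can read it back.
func lookups() (sameType, otherType, plainString bool) {
	ctx := context.WithValue(context.Background(), contextKey("request_id"), "abc-123")

	_, sameType = ctx.Value(contextKey("request_id")).(string)
	_, otherType = ctx.Value(otherKey("request_id")).(string)
	_, plainString = ctx.Value("request_id").(string)
	return sameType, otherType, plainString
}

func main() {
	sameType, otherType, plainString := lookups()
	fmt.Println("contextKey:", sameType)
	fmt.Println("otherKey:  ", otherType)
	fmt.Println("string:    ", plainString)
}
```

Only the key of the original type finds the value, so keeping the key type unexported means no other package can read or overwrite it by accident.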
Pattern 6: Handle Cancellation in Long Operations
If you have a long-running operation, check context periodically:
func ProcessBatch(ctx context.Context, items []Item) error {
for i, item := range items {
// Check every 100 items
if i%100 == 0 {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
}
if err := processItem(item); err != nil {
return err
}
}
return nil
}
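Polling works for loops, but some long operations simply block (a backoff sleep, a wait on a channel). There the usual approach is a select that races the work against ctx.Done(). Here is a sketch using a hypothetical `sleepCtx` helper; the durations are arbitrary.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx is a hypothetical helper: it waits for d, or returns early with
// ctx.Err() if the context is done first.
func sleepCtx(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-timer.C:
		return nil
	}
}

// demo runs sleepCtx once to completion and once against a short deadline.
func demo() (completed, interrupted error) {
	completed = sleepCtx(context.Background(), 5*time.Millisecond)

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Millisecond)
	defer cancel()
	interrupted = sleepCtx(ctx, time.Hour)
	return completed, interrupted
}

func main() {
	completed, interrupted := demo()
	fmt.Println("completed:  ", completed)
	fmt.Println("interrupted:", interrupted)
}
```

The second call asks for an hour but returns as soon as the 5ms deadline passes, which is the behaviour you want from anything that waits.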
For I/O operations, pass the context to the I/O function:
func FetchData(ctx context.Context, url string) ([]byte, error) {
    // NewRequestWithContext (Go 1.13+) ties the request to ctx from the start
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+)
    return io.ReadAll(resp.Body)
}
Pattern 7: Don’t Store Context in Structs
This is a common mistake:
// Bad
type Server struct {
ctx context.Context
}
func (s *Server) HandleRequest(req *Request) error {
// Using stored context
return s.process(s.ctx, req)
}
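Why is this bad? Contexts are request-scoped. A context stored in a long-lived struct is shared by every request that passes through it, so it is either never cancelled or cancelled for the wrong request, and callers cannot supply their own deadline.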
Instead, pass context as a parameter:
// Good
type Server struct {
config *Config
}
func (s *Server) HandleRequest(ctx context.Context, req *Request) error {
return s.process(ctx, req)
}
Real-World Example: HTTP Server with Context
Here’s how I structure HTTP handlers:
func (s *Server) handleUser(w http.ResponseWriter, r *http.Request) {
    // Get context from request
    ctx := r.Context()
    // Add request ID for logging
    requestID := generateID()
    ctx = WithRequestID(ctx, requestID)
    // Set timeout for this request
    ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
    defer cancel()
    // Extract user ID from URL
    userID := mux.Vars(r)["id"]
    // Fetch user (respects context timeout and cancellation)
    user, err := s.fetchUser(ctx, userID)
    if err != nil {
        // errors.Is also matches when a lower layer has wrapped the error
        if errors.Is(err, context.DeadlineExceeded) {
            http.Error(w, "Request timeout", http.StatusGatewayTimeout)
            return
        }
        if errors.Is(err, context.Canceled) {
            // Client disconnected
            return
        }
        LogError(ctx, "Failed to fetch user", err)
        http.Error(w, "Internal error", http.StatusInternalServerError)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    if err := json.NewEncoder(w).Encode(user); err != nil {
        LogError(ctx, "Failed to encode user", err)
    }
}
func (s *Server) fetchUser(ctx context.Context, userID string) (*User, error) {
    // Check cache first
    if user := s.cache.Get(userID); user != nil {
        return user, nil
    }
    // Fetch from database with context. The columns are named so their order
    // always matches Scan below (column names assumed from the User fields).
    query := "SELECT id, name, email FROM users WHERE id = $1"
    row := s.db.QueryRowContext(ctx, query, userID)
    var user User
    if err := row.Scan(&user.ID, &user.Name, &user.Email); err != nil {
        return nil, err
    }
    s.cache.Set(userID, &user)
    return &user, nil
}
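One detail that matters when checking for context errors: lower layers often wrap them, and a wrapped error no longer compares equal to the sentinel with ==. errors.Is walks the wrap chain instead. Here is a minimal sketch; `classify` and `wrappedTimeout` are hypothetical helpers, and the "fetch user" prefix is just an example of added detail.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// classify is a hypothetical helper that maps an error to a coarse category.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return "internal"
	}
}

// wrappedTimeout imitates a lower layer that adds detail with %w.
func wrappedTimeout() error {
	return fmt.Errorf("fetch user: %w", context.DeadlineExceeded)
}

func main() {
	err := wrappedTimeout()
	fmt.Println(err == context.DeadlineExceeded) // false: the wrapper is a different value
	fmt.Println(classify(err))                   // timeout
}
```

A handler that compares with == would send a wrapped timeout down the generic "internal error" path, so the timeout never shows up as a timeout in logs or status codes.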
Testing with Context
For tests, use context.Background() or create a context with timeout:
func TestFetchUser(t *testing.T) {
ctx := context.Background()
user, err := FetchUser(ctx, "user123")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if user.ID != "user123" {
t.Errorf("expected user123, got %s", user.ID)
}
}
func TestFetchUserTimeout(t *testing.T) {
    ctx, cancel := context.WithTimeout(context.Background(), 1*time.Millisecond)
    defer cancel()
    <-ctx.Done() // Wait until the deadline has actually passed
    _, err := FetchUser(ctx, "user123")
    if !errors.Is(err, context.DeadlineExceeded) {
        t.Errorf("expected DeadlineExceeded, got %v", err)
    }
}
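Tests that depend on real time passing tend to be flaky. One alternative is a context whose deadline is already in the past, so it is expired the moment it is created. A sketch, with `expiredContext` and `expiredErr` as made-up helper names:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// expiredContext is a hypothetical test helper: its deadline is already in
// the past, so the context is done before the caller ever uses it.
func expiredContext() (context.Context, context.CancelFunc) {
	return context.WithDeadline(context.Background(), time.Now().Add(-time.Second))
}

// expiredErr returns the error an already-expired context reports.
func expiredErr() error {
	ctx, cancel := expiredContext()
	defer cancel()
	return ctx.Err()
}

func main() {
	fmt.Println(expiredErr())
}
```

A test can pass such a context straight to the function under test and assert on the deadline error without sleeping.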
Conclusion
Context is one of Go’s most powerful features, but it requires discipline to use correctly. The patterns I’ve outlined here have saved us from multiple production issues.
Key takeaways:
- Always pass context as the first parameter
- Set timeouts at the right level (usually once at the top)
- Always defer cancel()
- Use context values only for request-scoped data
- Check context cancellation in long operations
- Never store context in structs
If you’re building Go microservices, invest time in getting context right. It’ll save you from cascading failures and resource leaks down the road.