Setting Up Prometheus Monitoring for Microservices
We set up Prometheus to monitor our microservices. This post walks through the installation, configuration, instrumentation, queries, and alerting we use.
Why Prometheus?
- Pull-based metrics collection
- Powerful query language (PromQL)
- Built-in alerting
- Great for dynamic environments (Kubernetes)
- Open source
Architecture
Microservices → Prometheus → Grafana
                    ↓
               Alertmanager
Installation
# Prometheus
docker run -d \
  -p 9090:9090 \
  -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
# Grafana
docker run -d \
  -p 3000:3000 \
  grafana/grafana
Configuration
# prometheus.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'api'
    static_configs:
      - targets: ['api:8080']
  - job_name: 'worker'
    static_configs:
      - targets: ['worker:8080']
Instrumenting Applications
Go
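package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
	// httpRequests counts handled requests by method, endpoint, and status.
	httpRequests = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total HTTP requests",
		},
		[]string{"method", "endpoint", "status"},
	)
)

func init() {
	// Register the counter with the default registry served by promhttp.
	prometheus.MustRegister(httpRequests)
}

func handler(w http.ResponseWriter, r *http.Request) {
	// The status is hard-coded for brevity. Raw URL paths can also create
	// many label values, so prefer a route name where paths contain IDs.
	httpRequests.WithLabelValues(r.Method, r.URL.Path, "200").Inc()
	// Handle request
}

func main() {
	http.HandleFunc("/", handler)
	// Expose the metrics endpoint that Prometheus scrapes.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}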
Java
import io.prometheus.client.Counter;
import io.prometheus.client.exporter.HTTPServer;

public class MyApp {
    // Counts handled requests by method, endpoint, and status.
    static final Counter requests = Counter.build()
        .name("http_requests_total")
        .help("Total HTTP requests")
        .labelNames("method", "endpoint", "status")
        .register();

    public static void main(String[] args) throws Exception {
        // Serve the registered metrics over HTTP on port 8080.
        HTTPServer server = new HTTPServer(8080);
        // Your app logic
        requests.labels("GET", "/api/users", "200").inc();
    }
}
Queries
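# Request rate (requests per second, per series)
rate(http_requests_total[5m])
# Error rate (5xx responses per second)
rate(http_requests_total{status=~"5.."}[5m])
# 95th percentile latency (requires an http_request_duration_seconds histogram)
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
# Memory usage (resident memory, in bytes)
process_resident_memory_bytes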
Grafana Dashboards
We created dashboards for:
- Request rate
- Error rate
- Latency (p50, p95, p99)
- CPU/Memory usage
- Database connections
Alerting
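# alert.rules
groups:
  - name: api
    rules:
      - alert: HighErrorRate
        # Fires when more than 5% of an instance's requests return 5xx
        expr: |
          sum by (instance) (rate(http_requests_total{status=~"5.."}[5m]))
            / sum by (instance) (rate(http_requests_total[5m])) > 0.05
        for: 5m
        annotations:
          summary: "High error rate on {{ $labels.instance }}"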
The Results
- Full visibility into all services
- Alerts that fire before users notice issues
- Easy troubleshooting with metrics
Prometheus is now essential to our operations.
Questions? Ask away!