We’ve been testing Istio for 2 months. Here’s what we learned.

What is Istio?

Istio is a service mesh for Kubernetes. It adds:

  • Traffic management
  • Security (mTLS)
  • Observability
  • Policy enforcement

Without changing your application code.

How It Works

Istio injects a sidecar proxy (Envoy) into each pod:

Pod
├── Your App (port 8080)
└── Envoy Proxy (intercepts all traffic)

All traffic goes through Envoy, which provides the features.

Installation

# Download Istio
curl -L https://git.io/getLatestIstio | sh -

# Install (from inside the downloaded istio-* directory)
kubectl apply -f install/kubernetes/istio-demo.yaml

# Enable sidecar injection for a namespace
kubectl label namespace default istio-injection=enabled
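
A quick sanity check that everything came up (component names vary by release):

# Control-plane pods should all be Running
kubectl get pods -n istio-system

# Injection only happens at pod creation, so restart existing pods;
# new ones then show 2/2 containers (your app + the Envoy sidecar)
kubectl get pods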

Traffic Management

Canary Deployments

Send beta users straight to v2, and give everyone else a 90/10 split:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - match:
    - headers:
        user-type:
          exact: beta
    route:
    - destination:
        host: myapp
        subset: v2
  - route:
    - destination:
        host: myapp
        subset: v1
      weight: 90
    - destination:
        host: myapp
        subset: v2
      weight: 10
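
The v1 and v2 subsets above aren't defined by the VirtualService itself; they live in a DestinationRule. A minimal sketch, assuming the deployments carry version: v1 / version: v2 labels (in practice these subsets share one DestinationRule with the traffic policy in the next section):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: myapp
spec:
  host: myapp
  subsets:
  - name: v1
    labels:
      version: v1    # selects pods labeled version: v1
  - name: v2
    labels:
      version: v2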

Circuit Breaking

Prevent cascading failures:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: myapp
spec:
  host: myapp
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100          # cap concurrent TCP connections to myapp
      http:
        http1MaxPendingRequests: 50  # queue at most 50 pending requests
        maxRequestsPerConnection: 2  # recycle connections after 2 requests
    outlierDetection:
      consecutiveErrors: 5           # eject a pod after 5 consecutive errors
      interval: 30s                  # how often pods are checked
      baseEjectionTime: 30s          # ejected pods sit out at least this long

Retries and Timeouts

Cap each request at 10 seconds overall, with up to three attempts of 2 seconds each (so the retry budget fits inside the overall timeout):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
    timeout: 10s
    retries:
      attempts: 3
      perTryTimeout: 2s

Security

Mutual TLS

Automatic encryption between services:

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
spec:
  peers:
  - mtls: {}

All service-to-service traffic is now encrypted. No code changes.
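
One caveat: with this API version, mTLS is two-sided. The Policy above makes servers require mTLS; client sidecars also need a DestinationRule telling them to originate it, along these lines (the *.local host pattern covers in-mesh services):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
spec:
  host: "*.local"          # all in-mesh services
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL   # sidecars present Istio-issued client certs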

Authorization

apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRole
metadata:
  name: api-reader
spec:
  rules:
  - services: ["api.default.svc.cluster.local"]
    methods: ["GET"]
---
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-api-reader
spec:
  subjects:
  - user: "cluster.local/ns/default/sa/frontend"
  roleRef:
    kind: ServiceRole
    name: api-reader
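
Note that Istio's RBAC is off by default; it has to be switched on explicitly first. A sketch (the kind varies by release: RbacConfig in older ones, ClusterRbacConfig later):

apiVersion: rbac.istio.io/v1alpha1
kind: RbacConfig
metadata:
  name: default            # the name must be "default"
spec:
  mode: ON_WITH_INCLUSION  # enforce RBAC only for the listed services
  inclusion:
    services: ["api.default.svc.cluster.local"]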

Observability

Distributed Tracing

Istio integrates with Jaeger:

kubectl apply -f install/kubernetes/addons/jaeger.yaml

See request flow across services. No instrumentation needed.
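
To open the Jaeger UI locally (pod labels and ports can differ between releases; app=jaeger and 16686 are the bundled addon's defaults):

kubectl port-forward -n istio-system \
  $(kubectl get pod -n istio-system -l app=jaeger \
    -o jsonpath='{.items[0].metadata.name}') 16686:16686

# then browse to http://localhost:16686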

Metrics

Automatic Prometheus metrics:

  • Request rate
  • Error rate
  • Latency (p50, p95, p99)
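
As a taste, the per-service error rate falls out of a single query (metric and label names vary by Istio version; istio_requests_total and destination_service are assumed here):

# Expose the bundled Prometheus, then run the query in its UI:
kubectl port-forward -n istio-system svc/prometheus 9090:9090

#   sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_service)
#     / sum(rate(istio_requests_total[5m])) by (destination_service)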

Service Graph

Visualize service dependencies with Kiali:

kubectl apply -f install/kubernetes/addons/kiali.yaml
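
And to reach the Kiali dashboard (service name and port assumed from the addon defaults):

kubectl port-forward -n istio-system svc/kiali 20001:20001

# then browse to http://localhost:20001/kiali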

What We Like

1. Zero Code Changes

All features work without modifying apps. Just deploy to an Istio-enabled namespace.

2. Powerful Traffic Management

Canary deployments, A/B testing, traffic splitting - all declarative.

3. Security by Default

mTLS between all services. No manual certificate management.

4. Observability

Distributed tracing and metrics out of the box.

What We Don’t Like

1. Complexity

Istio adds:

  • 10+ CRDs (Custom Resource Definitions)
  • 5+ new components (Pilot, Mixer, Citadel, etc.)
  • Sidecar in every pod

Lots of moving parts.

2. Resource Overhead

Each sidecar uses:

  • ~50MB memory
  • ~0.1 CPU

For 100 pods, that’s 5GB memory and 10 CPUs just for sidecars.

3. Debugging is Harder

When something breaks, is it:

  • Your app?
  • Envoy?
  • Istio control plane?
  • Kubernetes?

More layers = harder debugging.

4. Performance Impact

Sidecar adds latency:

  • Without Istio: 10ms
  • With Istio: 15ms

That's a 50% relative increase on fast requests: not huge in absolute terms, but noticeable.

Our Decision

We’re not using Istio in production yet. Here’s why:

What We Need

  • Canary deployments
  • Circuit breaking
  • Observability

What We’re Using Instead

  • Canary: Helm + manual traffic splitting (sketch below)
  • Circuit breaking: Application-level (Hystrix, resilience4j)
  • Observability: Prometheus + Jaeger (with manual instrumentation)

This is simpler and we understand it better.
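
For the curious, the manual traffic splitting is nothing clever: two Deployments behind one Service, with the split set by replica counts. A sketch with hypothetical names; 9 v1 replicas + 1 v2 replica gives roughly 90/10, since kube-proxy balances across pods:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-v2
spec:
  replicas: 1              # myapp-v1 runs replicas: 9 for the ~90/10 split
  selector:
    matchLabels:
      app: myapp
      version: v2
  template:
    metadata:
      labels:
        app: myapp         # the Service selects only app: myapp, so both versions get traffic
        version: v2
    spec:
      containers:
      - name: myapp
        image: myapp:v2    # hypothetical image tag
        ports:
        - containerPort: 8080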

When Would We Use Istio?

If we had:

  • 50+ microservices (we have 20)
  • Complex traffic routing needs
  • Strict security requirements
  • Team dedicated to managing Istio

For now:

  • Our current setup works
  • Istio adds too much complexity
  • We’ll revisit in 6-12 months

Alternatives

Linkerd

Simpler than Istio, fewer features. Worth considering.

Consul Connect

If you’re already using Consul.

AWS App Mesh

If you’re on AWS and want managed service mesh.

The Verdict

Istio is powerful but complex. Great for large organizations with many microservices.

For smaller teams, the complexity might not be worth it.

We’re watching Istio closely. It’s maturing fast. Maybe in 2019.

Anyone using Istio in production? How’s it going?