We recently had a security audit that revealed a scary truth: any pod in our Kubernetes cluster could talk to any other pod. A compromised frontend container could directly access our database pods.

I spent the last month implementing network policies to enforce zero-trust networking. Here’s what I learned about securing Kubernetes at the network layer.

The Problem: Default Allow-All

By default, Kubernetes allows all pod-to-pod communication. This is convenient for development but dangerous in production:

┌─────────────┐         ┌──────────────┐
│  Frontend   │────────▶│   Database   │
│   (nginx)   │         │ (postgresql) │
└─────────────┘         └──────────────┘
      │
      │ Should NOT be allowed!
      ▼
┌──────────────┐
│   Payment    │
│   Service    │
└──────────────┘

If an attacker compromises the frontend, they can access everything. We need to restrict traffic to only what’s necessary.
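
You can see the problem from any throwaway pod - it reaches the database directly. A quick check (assuming a postgres service in the production namespace; adjust the names to your setup):

# From any pod in the cluster - nothing stops this by default
kubectl run poke --image=nicolaka/netshoot --rm -it --restart=Never -- \
  nc -zv postgres.production.svc.cluster.local 5432
# A successful connection confirms the default allow-all behavior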

Network Policies: The Solution

Network Policies are Kubernetes resources that control traffic between pods. They work like firewall rules.

Basic structure:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: default
spec:
  podSelector: {}  # Applies to all pods
  policyTypes:
  - Ingress
  - Egress

This policy denies all ingress and egress traffic. It’s a good starting point - deny everything, then explicitly allow what’s needed.

Prerequisites: CNI Plugin Support

Network Policies require a CNI plugin that supports them. Not all do!

Supported:

  • Calico ✓
  • Cilium ✓
  • Weave Net ✓

Not supported:

  • Flannel ✗ (default in many clusters)
  • AWS VPC CNI ✗ (default on EKS; recent versions add an optional network policy agent, so check yours)
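
Not sure which CNI your cluster runs? It usually shows up as a DaemonSet in kube-system (names vary by distribution, so treat these as examples):

# Look for calico-node, cilium, weave-net, kube-flannel, aws-node, ...
kubectl get daemonsets -n kube-system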

We were using Flannel. I had to migrate to Calico:

# Remove Flannel
kubectl delete -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Install Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

This was nerve-wracking in production. I tested extensively in staging first. The migration went smoothly, but plan for a maintenance window.

Pattern 1: Default Deny

Start by denying all traffic:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Apply this to every namespace. Now nothing works - which is good! You’ll explicitly allow only what’s needed.
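
A quick way to roll it out everywhere (assuming the manifest above is saved as default-deny.yaml - a filename I'm making up - with the namespace field removed so -n takes effect):

# Apply the deny-all policy to every namespace except the system ones
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  case "$ns" in kube-system|kube-public|kube-node-lease) continue ;; esac
  kubectl apply -n "$ns" -f default-deny.yaml
done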

Pattern 2: Allow Ingress from Specific Pods

Allow frontend to access backend:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080

This allows:

  • Pods with label app=frontend
  • To access pods with label app=backend
  • On port 8080 only
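
To sanity-check what a policy selects once it's applied, describe it:

kubectl describe networkpolicy backend-allow-frontend -n production
# Check the PodSelector and "Allowing ingress traffic" sections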

Pattern 3: Allow Ingress from Specific Namespaces

Allow monitoring tools to scrape metrics:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring
  namespace: production
spec:
  podSelector:
    matchLabels:
      metrics: "true"
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: monitoring
    ports:
    - protocol: TCP
      port: 9090

This allows any pod in the monitoring namespace to reach pods labeled metrics=true on port 9090. Note that the monitoring namespace itself must carry the name=monitoring label for the selector to match (see Pitfall 3).

Pattern 4: Allow Egress to External Services

Allow pods to access external APIs:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:  # External IPs only - no catch-all in-cluster rules
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32  # Block AWS metadata service
    ports:
    - protocol: TCP
      port: 443

This allows HTTPS to external services but blocks the AWS metadata endpoint (a security best practice - stolen pod credentials often start there). Resist the urge to add catch-all namespaceSelector: {} or podSelector: {} entries to a policy like this: each one allows egress to every pod on every port and quietly defeats the restriction. Also note that how ipBlock treats in-cluster pod IPs varies by CNI plugin, so use pod and namespace selectors for in-cluster traffic, not CIDRs.

Pattern 5: Allow DNS

DNS is critical - without it, pods can’t resolve service names. Always allow DNS:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP  # DNS falls back to TCP for large responses
      port: 53

Watch the indentation: podSelector sits in the same list item as namespaceSelector, so the two are ANDed - kube-dns pods in the kube-system namespace. As a separate list item it would be ORed in, opening egress to every pod in kube-system and to anything labeled k8s-app=kube-dns in your own namespace.

I forgot this policy initially and spent an hour debugging why nothing worked. DNS is easy to overlook!
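
A quick way to confirm whether DNS is the thing being blocked:

# netshoot ships dig and nslookup; run it inside the locked-down namespace
kubectl run dnstest -n production --image=nicolaka/netshoot --rm -it --restart=Never -- \
  nslookup kubernetes.default.svc.cluster.local
# Times out if DNS egress is blocked; resolves once allow-dns is in place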

Real-World Example: Three-Tier App

Here’s a complete setup for a web app with frontend, backend, and database:

# Deny all traffic by default
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

# Allow DNS for all pods
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
    ports:
    - protocol: UDP
      port: 53

# Allow ingress controller to reach frontend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-allow-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    ports:
    - protocol: TCP
      port: 80

# Allow frontend to call backend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080

# Allow backend to access database
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-allow-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 5432

# Allow backend to call external APIs
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-external
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32
    ports:
    - protocol: TCP
      port: 443

This creates a zero-trust network:

  • Frontend only accepts traffic from ingress controller
  • Backend only accepts traffic from frontend
  • Database only accepts traffic from backend
  • Backend can call external HTTPS APIs
  • All pods can use DNS
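
One gap to catch before applying this as-is: default-deny blocks egress too, so the ingress allows above aren't enough on their own - the frontend still can't open connections to the backend, nor the backend to the database. Two more policies close the loop (a minimal sketch following the same conventions):

# Allow frontend to open connections to backend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-egress-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 8080

# Allow backend to open connections to the database
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-egress-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432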

Testing Network Policies

Testing is critical. I use a debug pod to verify policies:

# Create a debug pod in the target namespace (drops you into a shell)
kubectl run debug -n production --image=nicolaka/netshoot -it --rm

# Inside that shell, try to connect to a service
curl --max-time 5 http://backend:8080
# Should time out: the debug pod has no app=frontend label

# Try again from a pod with the right labels
kubectl run frontend-test -n production --image=nicolaka/netshoot --labels="app=frontend" -it --rm
curl http://backend:8080
# Should succeed: the policy matches app=frontend

I also wrote automated tests:

#!/bin/bash

# Test that frontend can reach backend - should print 200
kubectl run test-frontend -n production --image=curlimages/curl \
  --labels="app=frontend" --restart=Never --rm -it -- \
  curl -s -o /dev/null -w "%{http_code}" http://backend:8080

# Test that an unlabeled pod cannot reach backend - should time out
kubectl run test-random -n production --image=curlimages/curl \
  --restart=Never --rm -it -- \
  curl -s -o /dev/null -w "%{http_code}" http://backend:8080 --max-time 5
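
In CI I wrap these in real assertions. Capturing kubectl run output can pick up status messages depending on your kubectl version, so treat this as a sketch:

#!/bin/bash
set -euo pipefail

# Assert that app=frontend pods get HTTP 200 from the backend
code=$(kubectl run test-frontend -n production --image=curlimages/curl \
  --labels="app=frontend" --restart=Never --rm -i --quiet -- \
  curl -s -o /dev/null -w "%{http_code}" http://backend:8080)

if [ "$code" != "200" ]; then
  echo "FAIL: expected 200 from backend, got '$code'"
  exit 1
fi
echo "PASS: frontend -> backend allowed"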

Common Pitfalls

1. Forgetting DNS

Without DNS egress, pods can’t resolve service names. Always allow UDP port 53 to kube-dns.

2. Blocking health checks

Kubelet liveness and readiness probes originate from the node's IP, not from a pod, so podSelector rules won't match them. Allow them by CIDR:

ingress:
- from:
  - ipBlock:
      cidr: 10.0.0.0/8  # A range covering your node IPs - adjust for your cluster
  ports:
  - protocol: TCP
    port: 8080
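
To pick the right CIDR, list the node addresses first:

# InternalIP is the source of kubelet probes on most setups
kubectl get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'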

3. Namespace labels

Namespace selectors match labels on the namespace object, and a custom label like name=monitoring isn't there unless you add it:

kubectl label namespace monitoring name=monitoring

On Kubernetes 1.21+ every namespace also carries an automatic kubernetes.io/metadata.name label, which you can match instead of maintaining your own labels. I forgot to label the namespace and spent an hour debugging why the selectors didn't work.

4. Policy order doesn’t matter

Network Policies are additive. If any policy allows traffic, it’s allowed. You can’t override an allow with a deny.

5. Listing Egress denies all outbound

Network Policies are stateful: replies to an allowed connection are permitted automatically, so you never need a separate rule for response traffic. The real trap is that once a policy selecting a pod lists Egress in policyTypes, every new outbound connection from that pod is denied unless explicitly allowed - a pod can accept requests just fine yet be unable to reach its own database or DNS.

Monitoring and Debugging

I use Calico’s logging to debug policy violations:

# Enable logging on Calico
kubectl patch felixconfiguration default --type merge -p '{"spec":{"logSeverityScreen":"Info"}}'

# View logs
kubectl logs -n kube-system -l k8s-app=calico-node
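
To log actual policy verdicts rather than raising general verbosity, Calico's own policy CRDs support an explicit Log action that stock Kubernetes NetworkPolicy lacks. A sketch that logs, then denies, whatever your allow rules didn't match (order: 2000 is an assumption - pick a value that sorts after your other Calico policies):

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: log-and-deny-rest
  namespace: production
spec:
  order: 2000      # evaluated after lower-order policies
  selector: all()  # every pod in the namespace
  types:
  - Ingress
  ingress:
  - action: Log    # write the packet to the node's syslog
  - action: Deny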

If you run Cilium as your CNI, its Hubble component gives far better visibility. (Note that Hubble is built on Cilium and does not work with Calico - on a Calico cluster, the Felix logging above is your built-in option.)

kubectl apply -f https://raw.githubusercontent.com/cilium/hubble/master/install/kubernetes/quick-install.yaml

# View network flows
hubble observe --namespace production

Hubble shows all allowed and denied connections in real time. Invaluable for debugging.

Performance Impact

I was worried about performance overhead. After implementing policies across 50+ services:

  • Latency: No measurable increase
  • Throughput: No measurable decrease
  • CPU usage: Slight increase (~2%) on nodes

The overhead is minimal. The security benefits far outweigh the cost.

Gradual Rollout Strategy

Don’t implement network policies all at once. Here’s how I rolled them out:

  1. Week 1: Deploy policies in audit mode (Calico supports this via staged network policies)
  2. Week 2: Review logs, fix issues
  3. Week 3: Enable enforcement in staging
  4. Week 4: Enable enforcement in production, one namespace at a time

This gradual approach prevented outages.

Conclusion

Network Policies are essential for production Kubernetes security. They implement zero-trust networking at the pod level.

Key takeaways:

  1. Start with default deny-all
  2. Explicitly allow only necessary traffic
  3. Always allow DNS
  4. Test thoroughly before enforcing
  5. Use a CNI plugin that supports policies (Calico, Cilium, Weave)
  6. Monitor and log policy violations

The initial setup takes time, but it’s worth it. We now have defense in depth - even if a pod is compromised, the attacker can’t move laterally.

Network Policies should be part of every Kubernetes security strategy. Don’t run production workloads without them.