Securing Kubernetes with Network Policies: A Practical Guide
We recently had a security audit that revealed a scary truth: any pod in our Kubernetes cluster could talk to any other pod. A compromised frontend container could directly access our database pods.
I spent the last month implementing network policies to enforce zero-trust networking. Here’s what I learned about securing Kubernetes at the network layer.
The Problem: Default Allow-All
By default, Kubernetes allows all pod-to-pod communication. This is convenient for development but dangerous in production:
┌─────────────┐          ┌──────────────┐
│  Frontend   │─────────▶│   Database   │
│   (nginx)   │          │ (postgresql) │
└─────────────┘          └──────────────┘
       │
       │  Should NOT be allowed!
       │
       ▼
┌──────────────┐
│   Payment    │
│   Service    │
└──────────────┘
If an attacker compromises the frontend, they can access everything. We need to restrict traffic to only what’s necessary.
Network Policies: The Solution
Network Policies are Kubernetes resources that control traffic between pods. They work like firewall rules.
Basic structure:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: default
spec:
  podSelector: {}  # Applies to all pods
  policyTypes:
  - Ingress
  - Egress
This policy denies all ingress and egress traffic. It’s a good starting point - deny everything, then explicitly allow what’s needed.
Prerequisites: CNI Plugin Support
Network Policies require a CNI plugin that supports them. Not all do!
Supported:
- Calico ✓
- Cilium ✓
- Weave Net ✓
Not supported:
- Flannel ✗ (default in many clusters)
- AWS VPC CNI ✗ (default on EKS)
We were using Flannel. I had to migrate to Calico:
# Remove Flannel
kubectl delete -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Install Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
This was nerve-wracking in production. I tested extensively in staging first. The migration went smoothly, but plan for a maintenance window.
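Before declaring the migration done, it's worth confirming the new CNI is healthy on every node. A couple of quick checks (assuming the stock Calico manifests, which label the daemonset pods with k8s-app=calico-node):
# Calico should be running on every node
kubectl get pods -n kube-system -l k8s-app=calico-node -o wide
# All nodes should report Ready once the new CNI is up
kubectl get nodes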
Pattern 1: Default Deny
Start by denying all traffic:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
Apply this to every namespace. Now nothing works - which is good! You’ll explicitly allow only what’s needed.
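With more than a couple of namespaces, a small loop keeps this from getting tedious. A sketch - it assumes the policy above is saved as default-deny.yaml with the namespace field removed, and that the namespace list is yours to adjust:
#!/bin/bash
# Apply the same default-deny policy to each namespace
for ns in production staging tools; do
  kubectl apply -n "$ns" -f default-deny.yaml
done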
Pattern 2: Allow Ingress from Specific Pods
Allow frontend to access backend:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
This allows:
- Pods with label app=frontend
- To access pods with label app=backend
- On port 8080 only
Pattern 3: Allow Ingress from Specific Namespaces
Allow monitoring tools to scrape metrics:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring
  namespace: production
spec:
  podSelector:
    matchLabels:
      metrics: "true"
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: monitoring
    ports:
    - protocol: TCP
      port: 9090
This allows any pod in the monitoring namespace to access pods carrying the metrics=true label, on port 9090.
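One thing that bites people here: the selector only matches if the monitoring namespace actually carries that label. A quick check (see the namespace-labels pitfall later for the fix):
# Confirm the monitoring namespace has the label the namespaceSelector matches on
kubectl get namespace monitoring --show-labels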
Pattern 4: Allow Egress to External Services
Allow pods to access external APIs:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector: {}  # Allow to any namespace
  - to:
    - podSelector: {}        # Allow to any pod
  - to:                      # Allow to external IPs
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32
    ports:
    - protocol: TCP
      port: 443
This allows HTTPS to external services but blocks the AWS metadata endpoint (security best practice).
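To sanity-check the metadata block, exec into a backend pod and try the endpoint directly. A rough sketch - it assumes a Deployment named backend whose image ships curl:
# From inside a backend pod, the metadata endpoint should be unreachable
kubectl exec deploy/backend -- curl -s --max-time 5 http://169.254.169.254/latest/meta-data/ \
  || echo "metadata endpoint blocked (expected)"
# Normal HTTPS egress should still work
kubectl exec deploy/backend -- curl -s -o /dev/null -w "%{http_code}\n" --max-time 5 https://example.com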
Pattern 5: Allow DNS
DNS is critical - without it, pods can’t resolve service names. Always allow DNS:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
I forgot this initially and spent an hour debugging why nothing worked. DNS is easy to overlook!
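A quick way to confirm DNS is the culprit is to resolve a service name from a throwaway pod; here's the check I'd reach for (nothing fancy, just busybox):
# Test name resolution from inside the cluster
kubectl run dns-test --image=busybox:1.36 --rm -it --restart=Never -- \
  nslookup backend.production.svc.cluster.local
# If this hangs or times out while pod IPs are still reachable, DNS egress is blocked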
Real-World Example: Three-Tier App
Here’s a complete setup for a web app with frontend, backend, and database:
# Deny all traffic by default
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
# Allow DNS for all pods
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
    ports:
    - protocol: UDP
      port: 53
# Allow ingress controller to reach frontend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-allow-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ingress-nginx
    ports:
    - protocol: TCP
      port: 80
# Allow frontend to call backend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
# Allow backend to access database
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-allow-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 5432
# Allow backend to call external APIs
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-external
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32
    ports:
    - protocol: TCP
      port: 443
This creates a zero-trust network:
- Frontend only accepts traffic from ingress controller
- Backend only accepts traffic from frontend
- Database only accepts traffic from backend
- Backend can call external HTTPS APIs
- All pods can use DNS
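After applying the manifests, list what actually landed in the namespace; the POD-SELECTOR column makes it easy to spot a policy that selects nothing:
# List all policies in the namespace and the pods they select
kubectl get networkpolicy -n production
# Inspect a single policy in detail
kubectl describe networkpolicy backend-allow-frontend -n production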
Testing Network Policies
Testing is critical. I use a debug pod to verify policies:
# Create a debug pod
kubectl run debug --image=nicolaka/netshoot -it --rm
# Try to connect to a service
curl http://backend:8080
# Should fail if policy blocks it
# Try from a pod with correct labels
kubectl run frontend --image=nicolaka/netshoot --labels="app=frontend" -it --rm
curl http://backend:8080
# Should succeed if policy allows it
I also wrote automated tests:
#!/bin/bash
# Test that frontend can reach backend
kubectl run test-frontend --image=curlimages/curl --labels="app=frontend" --rm -it -- \
curl -s -o /dev/null -w "%{http_code}" http://backend:8080
# Should return 200
# Test that random pod cannot reach backend
kubectl run test-random --image=curlimages/curl --rm -it -- \
curl -s -o /dev/null -w "%{http_code}" http://backend:8080 --max-time 5
# Should timeout or return error
Common Pitfalls
1. Forgetting DNS
Without DNS egress, pods can’t resolve service names. Always allow UDP port 53 to kube-dns.
2. Blocking health checks
Kubernetes health checks come from the node, not from pods. Allow them:
ingress:
- from:
  - ipBlock:
      cidr: 10.0.0.0/8  # Your cluster CIDR
  ports:
  - protocol: TCP
    port: 8080
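The cidr above has to match your cluster's pod network. If you're not sure what that is, a couple of ways to look it up (a sketch; the output format varies by distribution):
# One way to find the pod CIDR assigned to each node
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# Or grep the controller-manager flags from a cluster dump (kubeadm-style clusters)
kubectl cluster-info dump | grep -m 1 cluster-cidr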
3. Namespace labels
Namespace selectors require labels on namespaces:
kubectl label namespace monitoring name=monitoring
I forgot this and spent an hour debugging why namespace selectors didn’t work.
4. Policy order doesn’t matter
Network Policies are additive. If any policy allows traffic, it’s allowed. You can’t override an allow with a deny.
5. Egress rules deny by default
If a policy only lists Ingress, egress is unaffected and remains allowed. Once Egress appears in policyTypes, every new outbound connection the pod initiates must be explicitly allowed. Policies are stateful, though: response traffic for an allowed connection is permitted automatically, so you don't need a mirror rule for replies.
Monitoring and Debugging
I use Calico’s logging to debug policy violations:
# Enable logging on Calico
kubectl patch felixconfiguration default --type merge -p '{"spec":{"logSeverityScreen":"Info"}}'
# View logs
kubectl logs -n kube-system -l k8s-app=calico-node
For better visibility, Hubble gives a live view of network flows - note that it requires Cilium as the CNI, so it's an option if you chose Cilium rather than Calico:
kubectl apply -f https://raw.githubusercontent.com/cilium/hubble/master/install/kubernetes/quick-install.yaml
# View network flows
hubble observe --namespace production
This shows all allowed and denied connections in real-time. Invaluable for debugging.
Performance Impact
I was worried about performance overhead. After implementing policies across 50+ services:
- Latency: No measurable increase
- Throughput: No measurable decrease
- CPU usage: Slight increase (~2%) on nodes
The overhead is minimal. The security benefits far outweigh the cost.
Gradual Rollout Strategy
Don’t implement network policies all at once. Here’s how I rolled them out:
- Week 1: Deploy policies in audit mode (Calico supports this)
- Week 2: Review logs, fix issues
- Week 3: Enable enforcement in staging
- Week 4: Enable enforcement in production, one namespace at a time
This gradual approach prevented outages.
Conclusion
Network Policies are essential for production Kubernetes security. They implement zero-trust networking at the pod level.
Key takeaways:
- Start with default deny-all
- Explicitly allow only necessary traffic
- Always allow DNS
- Test thoroughly before enforcing
- Use a CNI plugin that supports policies (Calico, Cilium, Weave)
- Monitor and log policy violations
The initial setup takes time, but it’s worth it. We now have defense in depth - even if a pod is compromised, the attacker can’t move laterally.
Network Policies should be part of every Kubernetes security strategy. Don’t run production workloads without them.