Better Resilience for Kubernetes Pods


I happened to notice that all 3 pods serving this blog in my Kubernetes cluster were allocated to the same node. I thought Kubernetes would try its best to shuffle pods of a deployment onto different nodes by default, but I guess I expected too much. Note knode3 in the NODE column below:

$ k get pods -o wide
NAME                            READY   STATUS      RESTARTS   AGE     IP               NODE       NOMINATED NODE   READINESS GATES
wordpress-867ccc444b-rtqn5      3/3     Running     0          17d     10.246.176.244   knode3     <none>           <none>
wordpress-867ccc444b-t24ms      3/3     Running     0          17d     10.246.176.250   knode3     <none>           <none>
wordpress-867ccc444b-z4p2r      3/3     Running     0          17d     10.246.176.242   knode3     <none>           <none>

Have you spotted the problem? In this scenario, if knode3 went down, all 3 pods would go down with it and my blog would return 503!

To explicitly ask Kubernetes to schedule pods onto different nodes, I added a podAntiAffinity rule so that each pod with label app=wordpress will try to stay away from the others. Here’s the snippet for the deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
spec:
  replicas: 3
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - wordpress
                topologyKey: "kubernetes.io/hostname"
...
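One thing worth knowing: preferredDuringSchedulingIgnoredDuringExecution is only a soft preference, so the scheduler can still co-locate pods when no other node fits. If losing a single node must never take down all replicas, there is a hard variant. Here’s a sketch of what that would look like for the same deployment (note the required form takes the pod affinity term directly, without the weight wrapper):

```yaml
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - wordpress
              topologyKey: "kubernetes.io/hostname"
```

The trade-off: with the required rule on, say, a 3-node cluster, a 4th replica would stay Pending forever, and if a node goes down its replica stays Pending until a node is free again. The soft "preferred" rule above avoids that at the cost of weaker guarantees.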

After a re-deploy, the pods are now scheduled in a more resilient way:

$ k get pods -o wide
NAME                            READY   STATUS      RESTARTS   AGE     IP               NODE     NOMINATED NODE   READINESS GATES
wordpress-7856864fd5-ksvsk      3/3     Running     0          39s     10.246.176.248   knode3   <none>           <none>
wordpress-7856864fd5-nltd4      3/3     Running     0          39s     10.246.12.75     knode6   <none>           <none>
wordpress-7856864fd5-xlrnv      3/3     Running     0          39s     10.246.81.32     knode5   <none>           <none>

🙂
