I happened to notice that all 3 pods serving this blog in my Kubernetes cluster had been scheduled onto the same node. I thought Kubernetes would try its best to spread the pods of a deployment across different nodes by default, but I guess I expected too much. Note the NODE column:
    $ k get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE   IP               NODE     NOMINATED NODE   READINESS GATES
    wordpress-867ccc444b-rtqn5   3/3     Running   0          17d   10.246.176.244   knode3   <none>           <none>
    wordpress-867ccc444b-t24ms   3/3     Running   0          17d   10.246.176.250   knode3   <none>           <none>
    wordpress-867ccc444b-z4p2r   3/3     Running   0          17d   10.246.176.242   knode3   <none>           <none>
Have you spotted the problem? In this scenario, if knode3 goes down, all 3 pods go down with it and my blog will return 503!
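To explicitly ask Kubernetes to schedule the pods onto different nodes, I added podAntiAffinity so that each pod with the label app=wordpress tries to stay away from the others. Because the rule is preferredDuringSchedulingIgnoredDuringExecution, it is a soft preference: the scheduler can still place pods on the same node if no other node fits. Here's the snippet for the deployment: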
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: wordpress
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: wordpress
      template:
        metadata:
          labels:
            app: wordpress
        spec:
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  labelSelector:
                    matchExpressions:
                    - key: app
                      operator: In
                      values:
                      - wordpress
                  topologyKey: "kubernetes.io/hostname"
          ...
After a redeploy, the pods are now scheduled in a more resilient way, one per node:
    $ k get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE   IP               NODE     NOMINATED NODE   READINESS GATES
    wordpress-7856864fd5-ksvsk   3/3     Running   0          39s   10.246.176.248   knode3   <none>           <none>
    wordpress-7856864fd5-nltd4   3/3     Running   0          39s   10.246.12.75     knode6   <none>           <none>
    wordpress-7856864fd5-xlrnv   3/3     Running   0          39s   10.246.81.32     knode5   <none>           <none>
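To check the spread without reading the table by eye, the node names can be tallied. This is only a minimal sketch: `pods_per_node` is a hypothetical helper name of my own, and it assumes its input is one node name per line, as produced by the `custom-columns` output shown in the usage comment.

```shell
# Hypothetical helper: count how many pods landed on each node.
# Reads one node name per line on stdin and prints "<node> <count>", sorted by node name.
pods_per_node() {
  awk 'NF { count[$1]++ } END { for (node in count) print node, count[node] }' | sort
}

# Usage (assumes kubectl is configured for the cluster and the pods carry
# the app=wordpress label from the deployment above):
#   kubectl get pods -l app=wordpress --no-headers \
#     -o custom-columns=NODE:.spec.nodeName | pods_per_node
```

For the original placement this would print a single line, `knode3 3`; after the redeploy each node should show a count of 1.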