Kubernetes: How to Use Affinity

Affinity is a great Kubernetes feature for assigning pods to nodes based on labels. In my case, I have a hybrid Kubernetes cluster where half of the nodes are x86 and the other half are ARM, and I need to deploy the x86-only containers to the x86 nodes. Of course I could build multi-arch containers to remove this restriction altogether, but let’s see how affinity works first.

All the nodes carry a label for their CPU architecture, and those labels can be printed out like this:

# the trick in jsonpath is to escape the dot "." and slash "/" in label keys, in this example kubernetes.io/arch
k get node -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.labels.kubernetes\.io\/arch}{"\n"}{end}'
kmaster	arm
knode1	arm
knode2	arm
knode3	amd64
knode4	amd64
knode5	amd64
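
As an aside, the same information can be printed without any jsonpath escaping by asking kubectl for a label column:

# -L / --label-columns adds the label's value as an extra column in the output
kubectl get nodes -L kubernetes.io/arch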

Whether it’s a plain Pod or a Deployment, StatefulSet, etc., the affinity rules go into the pod’s spec, e.g.

# this is only a partial example of a deployment with affinity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                - key: kubernetes.io/arch
                  operator: In
                  values:
                    - amd64

The pods of the Deployment above will only be scheduled onto nodes labelled with the amd64 (x86_64) architecture.
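
A quick way to confirm the placement is to look at the NODE column of the pods belonging to the web Deployment defined above:

# every pod of the "web" Deployment should land on one of the amd64 nodes
kubectl get pods -o wide | grep '^web-'
# if a pod stays Pending, the scheduler explains the unmatched affinity under Events
kubectl describe pod <pending-pod-name> | tail -n 20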

Note: requiredDuringSchedulingIgnoredDuringExecution is a hard requirement: if no node satisfies it, the pod won’t be scheduled at all. For a soft preference, use preferredDuringSchedulingIgnoredDuringExecution instead.
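
If a soft preference is enough, the structure is almost identical, just with a weight added. Below is a minimal, hypothetical Deployment (the web-preferred name and the nginx:alpine image are mine, not part of the original setup) that prefers amd64 nodes but can still fall back to ARM:

# apply a small Deployment that only *prefers* amd64 nodes
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-preferred
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-preferred
  template:
    metadata:
      labels:
        app: web-preferred
    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100              # 1-100, higher means stronger preference
              preference:
                matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      containers:
        - name: web
          image: nginx:alpine          # a multi-arch image, so the ARM fallback still works
EOF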

🙂

Kubernetes and GitOps with Flux CD V2.0

GitOps could be the next big thing in cloud automation, so I thought I’d give it a try on my in-house hybrid Kubernetes cluster. I was recommended to try Flux CD, and there’s a good reference project initiated by my colleague: k8s-gitops.

However, in order to fully understand how to use Flux CD, I chose to start from scratch. Following the official instructions, it didn’t take me long to fully enable GitOps on my cluster. Here’s how I did it from my laptop running Ubuntu:

First, create a GitHub PAT (Personal Access Token) with full repository permissions. Details can be read here. Also make sure you can create a private repository on GitHub (private repositories are available on free accounts). Export your GitHub username and PAT as environment variables as follows:

export GITHUB_TOKEN=<your-token>
export GITHUB_USER=<your-username>

The latest Flux2 CLI can be downloaded here. You can also use the installation script from Flux if you fully trust it:

curl -s https://toolkit.fluxcd.io/install.sh | sudo bash

From this step onward you will need access to a Kubernetes cluster, i.e. the kubectl cluster-info command works and returns cluster information. Check Flux2’s prerequisites with:

flux check --pre
► checking prerequisites
✔ kubectl 1.18.6 >=1.18.0
✔ Kubernetes 1.18.9 >=1.16.0
✔ prerequisites checks passed

Then run the Flux2 command below to bootstrap a private GitHub repository named flux-gitops using your GitHub PAT. This repository becomes your cluster-as-code command centre for the GitOps practice, and the CRDs (Custom Resource Definitions) and controllers for Flux2 will be installed into the current cluster:

flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=flux-gitops \
  --branch=main \
  --path=home-cluster \
  --personal

In the generated flux-gitops repository, the file structure looks like this:

flux-gitops
  - home-cluster
    - flux-system
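
Before adding any workloads, it’s worth confirming the bootstrap actually installed everything into the cluster; a couple of quick checks:

# the Flux2 controllers live in the flux-system namespace created by the bootstrap
kubectl -n flux-system get pods
# runs the same checks as "flux check --pre" plus a health check of the installed controllers
flux check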

Now you can simply add Helm charts or Kustomize templates to this repository and the changes will be applied to the cluster automatically. The following commands create a simple namespace manifest and register it with Flux2 as a Kustomization. After the changes are pushed to GitHub, the Flux2 controllers will apply them and create the new namespace.

cd flux-gitops/home-cluster
mkdir my-test
cd my-test
kustomize create
kubectl create namespace my-test --dry-run=client -o yaml > ns.yaml
kustomize edit add resource ns.yaml
cd .. # in home-cluster
flux create kustomization my-test --source=flux-system --path=home-cluster/my-test --prune=true --validation=client --interval=2m --export > my-test.yaml
# check-in everything to test GitOps
git add my-test my-test.yaml
git commit -m "Added my-test"
git push

Then you can use a watch command to see how the new change gets applied:

watch flux get kustomizations
NAME                    READY   MESSAGE                                                         REVISION                                        SUSPENDED
flux-system             True    Applied revision: main/529288eed6105909a97f0d3539bc68e5e934418a main/529288eed6105909a97f0d3539bc68e5e934418a   False
my-test                 True    Applied revision: main/529288eed6105909a97f0d3539bc68e5e934418a main/529288eed6105909a97f0d3539bc68e5e934418a   False
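
Flux can also be told to reconcile right away rather than waiting for the next 2-minute interval, and the result can be double-checked on the cluster side:

# fetch the latest commit and apply it immediately
flux reconcile kustomization my-test --with-source
# the namespace declared in ns.yaml should now be present
kubectl get namespace my-test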

That’s it, the Flux2 Hello-world. 🙂

Hybrid Kubernetes Cluster (X86 + ARM)

My old ASUS 15″ laptop bought in 2014. It has a sub-woofer!

The one in the picture was my old laptop, then my daughter’s for a few years. Now that she’s got a nice new 2-in-1 ultrabook the school asked us parents to buy, this clunky one was gathering dust on a shelf. I tried to sell it but got no one’s attention, despite its i7 CPU and 16GB of memory.

So I was thinking: this has the same amount of memory as 4 x Raspberry Pi 4, but I probably won’t be able to sell it to pay for the Pis, so why not just use it as a glorified Raspberry Pi? I measured its power consumption and, to my surprise, this 4th-gen i7 only draws about 10W when idle with the screen off, which is not bad at all. In comparison, 4 x Pi 4 would probably need 20W to stay up.

Let’s do it then!

I re-installed the OS with Ubuntu Server 20.04 LTS and prepared it for kubeadm with my ansible playbooks here. Since I’ve updated my playbook to handle both Raspbian on ARM and Ubuntu on x86_64, it was fairly easy to get the laptop (called knode3 from here on) ready.

I haven’t locked down versions in my playbook, so the installed docker and kubeadm were much newer than the ones in my existing Raspberry Pi cluster, and there would be compatibility issues if I didn’t match them. I used the following commands to downgrade docker and kubeadm:

apt remove docker-ce --purge
apt install docker-ce=5:19.03.9~3-0~ubuntu-focal
apt install kubeadm=1.18.13-00
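
Since the playbook doesn’t pin versions, it’s probably also worth holding the downgraded packages so a routine apt upgrade doesn’t bump them again (an optional step on my part, not something the playbook does):

# keep apt from upgrading these until the rest of the cluster catches up
apt-mark hold docker-ce kubeadm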

The kubeadm join command I had run on the other nodes a while back didn’t work anymore; it complained about the token. Of course, the token had expired after a year or so. Here’s the command to issue a new token from the master node:

kubeadm token create
xxxxxx.xxx...

Grab the new token and replace the one in the join command:

kubeadm join <master IP>:6443 --token <new token xxx> --discovery-token-ca-cert-hash sha256:<hash didn't change>
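
Alternatively, kubeadm can mint a fresh token and print the whole join command in one go, which saves copying the CA cert hash around:

# run on the master node; prints a ready-to-run "kubeadm join ..." line with a new token
kubeadm token create --print-join-command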

For debugging purposes I ran journalctl -f in another terminal tab to watch the output. When the join command finished, I ran kubectl get node in my local terminal session to verify the result:

kubectl get node
NAME      STATUS   ROLES    AGE    VERSION
kmaster   Ready    master   89d    v1.18.8
knode1    Ready    <none>   89d    v1.18.8
knode2    Ready    <none>   89d    v1.18.8
knode3    Ready    <none>   3m     v1.20.1

The Kubernetes version on the new node is a bit newer, so maybe I’ll upgrade the old nodes soon. Now I have a node with 16GB of memory 🙂

PS: to keep the laptop running when the lid is closed, I used this tweak.
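
That link may describe something slightly different, but a common approach on Ubuntu Server is to tell systemd-logind to ignore the lid switch; a sketch of that approach:

# make logind ignore the lid switch (sets/uncomments HandleLidSwitch in logind.conf)
sudo sed -i 's/^#\?HandleLidSwitch=.*/HandleLidSwitch=ignore/' /etc/systemd/logind.conf
# restart logind so the new setting takes effect
sudo systemctl restart systemd-logind.service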

Renew Certificates Used in Kubeadm Kubernetes Cluster

It’s been more than a year since I built my Kubernetes cluster with some Raspberry Pis. There were a few times when I needed to power down everything to let electricians do their work, and the cluster came back online and seemed to be OK afterwards, even though I didn’t shut the Pis down properly at all.

Recently I found that I had lost contact with the cluster; it looked like this:

$ kubectl get node
The connection to the server 192.168.x.x:6443 was refused - did you specify the right host or port?

The first thought that came to my mind was that the cluster must have been hacked, since it had been on auto-pilot for months. But I could still ssh into the master node, so it wasn’t that bad. Then I saw the errors in the kubelet.service logs:

Sep 23 15:58:05 kmaster kubelet[1233]: E0923 15:58:05.341773    1233 bootstrap.go:263] Part of the existing bootstrap client certificate is expired: 2020-09-15 10:40:36 +0000 UTC

That makes perfect sense! The cluster’s first anniversary was just a few days ago, and the certificates only last a year. Here’s the StackOverflow answer which I found very helpful for this issue.
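
Before regenerating anything, it’s useful to see exactly which certificates have expired; on kubeadm 1.15 the subcommand still lives under alpha (on 1.20+ it’s kubeadm certs check-expiration):

# list the expiry dates of all control-plane certificates managed by kubeadm
$ kubeadm alpha certs check-expiration
# or inspect a single certificate directly
$ openssl x509 -noout -enddate -in /etc/kubernetes/pki/apiserver.crt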

I ran the following commands on the master node and the API server came back to life:

$ mkdir -p /tmp/backup   # make sure the backup directory exists first
$ cd /etc/kubernetes/pki/
$ mv {apiserver.crt,apiserver-etcd-client.key,apiserver-kubelet-client.crt,front-proxy-ca.crt,front-proxy-client.crt,front-proxy-client.key,front-proxy-ca.key,apiserver-kubelet-client.key,apiserver.key,apiserver-etcd-client.crt} /tmp/backup
$ kubeadm init phase certs all --apiserver-advertise-address <IP>
$ cd /etc/kubernetes/
$ mv {admin.conf,controller-manager.conf,kubelet.conf,scheduler.conf} /tmp/backup
$ kubeadm init phase kubeconfig all
$ systemctl restart kubelet.service
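
For the record, kubeadm also ships a renew subcommand that regenerates the same certificates in place. I didn’t take this route, but it should be roughly equivalent (it’s under alpha on 1.15, and kubeadm certs renew all on 1.20+); older versions may still need the kubeconfig files regenerated separately:

# renew all control-plane certificates in place
$ kubeadm alpha certs renew all
# regenerate the kubeconfig files that embed client certificates
$ kubeadm init phase kubeconfig all
$ systemctl restart kubelet.service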

I’m not sure if all the new certs will be distributed to the nodes automatically, but at least the API server didn’t complain anymore. I might do a kubeadm upgrade soon.

$ kubectl get node
NAME      STATUS     ROLES    AGE    VERSION
kmaster   NotReady   master   372d   v1.15.3
knode1    NotReady   <none>   372d   v1.15.3
knode2    NotReady   <none>   372d   v1.15.3

EDIT: after the certs were renewed, the kubelet service couldn’t authenticate anymore and the nodes appeared NotReady. This can be fixed by deleting the obsolete kubelet client certificate:

$ ls /var/lib/kubelet/pki -lht
total 28K
-rw------- 1 root root 1.1K Sep 23 19:12 kubelet-client-2020-09-23-19-12-52.pem
lrwxrwxrwx 1 root root   59 Sep 23 19:12 kubelet-client-current.pem -> /var/lib/kubelet/pki/kubelet-client-2020-09-23-19-12-52.pem
-rw------- 1 root root 2.7K Sep 23 19:12 kubelet-client-2020-09-23-19-12-51.pem
-rw------- 1 root root 1.1K Jun 17 00:56 kubelet-client-2020-06-17-00-56-59.pem
-rw------- 1 root root 1.1K Sep 16  2019 kubelet-client-2019-09-16-20-41-53.pem
-rw------- 1 root root 2.7K Sep 16  2019 kubelet-client-2019-09-16-20-40-40.pem
-rw-r--r-- 1 root root 2.2K Sep 16  2019 kubelet.crt
-rw------- 1 root root 1.7K Sep 16  2019 kubelet.key
$ rm /var/lib/kubelet/pki/kubelet-client-current.pem
$ systemctl restart kubelet.service
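
If the fix works, the nodes should gradually flip back to Ready; this can be watched with:

# watch the nodes re-register; it can take a minute or two
$ kubectl get node -w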

🙂