A Canary Upgrade of Istio 1.9 to 1.11


Prerequisites: full admin access to a Kubernetes cluster that has an older version of Istio installed.

A while ago I decided to try Istio in my garage Kubernetes lab and replaced ingress-nginx with istio-ingressgateway. At the time I installed Istio 1.9.4; the latest release is now 1.11.4. To avoid being left in the deprecated zone, I planned to upgrade my Istio installation.

I chose the canary upgrade path because:

  • The in-place upgrade can be disruptive, while the canary upgrade keeps both versions running during the upgrade
  • The in-place upgrade doesn’t encourage skipping a version, whereas the canary upgrade can go from 1.9 straight to 1.11

The first step is to update my istioctl command to the latest version. Then, following the official Istio upgrade document, I did a pre-flight check with:

istioctl version
client version: 1.11.4
control plane version: 1.9.4
data plane version: 1.9.4 (27 proxies)

istioctl x precheck
✔ No issues found when checking the cluster. Istio is safe to install or upgrade!
  To get started, check out https://istio.io/latest/docs/setup/getting-started/
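Since the two-minor-version jump is the main reason for going canary here, the gap can be computed from the versions that istioctl version reports. This is only a minimal sketch: minor_gap is a hypothetical helper name, and it assumes both versions share the same major number and use the plain MAJOR.MINOR.PATCH form.

```shell
# Hypothetical helper: print how many minor versions separate two releases.
# Assumes both arguments look like MAJOR.MINOR.PATCH with the same MAJOR.
minor_gap() {
  from_minor=$(printf '%s\n' "$1" | cut -d. -f2)
  to_minor=$(printf '%s\n' "$2" | cut -d. -f2)
  echo $((to_minor - from_minor))
}

# Usage: minor_gap 1.9.4 1.11.4
```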

The next step is to install the latest version with a revision. The revision name can be anything, as its only purpose is to distinguish the new control plane from the existing one. In my case I just used canary, but it could be 1-11-4, which is more meaningful.

istioctl install --set revision=canary
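For the more meaningful naming scheme, the revision can be derived from the version number by swapping dots for dashes. A small sketch, where revision_from_version is a hypothetical helper name of my own and not part of istioctl:

```shell
# Hypothetical helper: turn a version such as 1.11.4 into a revision name such as 1-11-4.
revision_from_version() {
  printf '%s\n' "$1" | tr '.' '-'
}

# Usage (not run here):
#   istioctl install --set revision="$(revision_from_version 1.11.4)"
```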

Now check if the new control plane is running. The istiod-canary pod is the one I need to pay attention to.

# below is just a copy from the official doc as I didn't save my command output
kubectl get pods -n istio-system -l app=istiod
NAME                                    READY   STATUS    RESTARTS   AGE
istiod-786779888b-p9s5n                 1/1     Running   0          114m
istiod-canary-6956db645c-vwhsk          1/1     Running   0          1m
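Instead of eyeballing the pod list, the rollout can be waited on. A rough sketch, assuming the revisioned deployment is named istiod-<revision> in istio-system (as the pod names above suggest); wait_for_revision and the 120s timeout are arbitrary choices of mine.

```shell
# Hypothetical helper: block until the revisioned istiod deployment is rolled out.
# Assumes the deployment is named istiod-<revision> in the istio-system namespace.
wait_for_revision() {
  rev="$1"
  kubectl rollout status "deployment/istiod-${rev}" -n istio-system --timeout=120s
}

# Usage (not run here): wait_for_revision canary
```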

Workloads can then be moved to the new control plane namespace by namespace, which lets me test the new Istio data plane gradually. Since I have ArgoCD in my cluster, I can simply change a namespace’s injection label in GitOps (istio.io/rev=canary in place of istio-injection) and let ArgoCD apply the change automatically. But to get the pods injected with the new Istio sidecars I still need to restart the deployment with:

kubectl rollout restart deployment httpbin -n httpbin
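Without ArgoCD, the label change and restart could be done by hand. A minimal sketch of that manual equivalent, not what I actually ran: switch_namespace_revision is a hypothetical helper, and it assumes the namespace previously carried the istio-injection label.

```shell
# Hypothetical helper: point one namespace at a revision and restart its deployments.
switch_namespace_revision() {
  ns="$1"
  rev="$2"
  # A trailing "-" removes a label; istio.io/rev selects the revisioned injector.
  kubectl label namespace "$ns" istio-injection- "istio.io/rev=$rev" --overwrite
  # Restart every deployment in the namespace so new pods get the new sidecar.
  kubectl rollout restart deployment -n "$ns"
}

# Usage (not run here): switch_namespace_revision httpbin canary
```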

Then I used the following command to check the result:

istioctl proxy-status | grep httpbin
httpbin-6d779b74f7-74pfs.httpbin                         SYNCED     SYNCED     SYNCED     SYNCED     istiod-canary-945cbbf49-mnwtn     1.11.4
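The same output can be used to spot proxies that are still attached to the old control plane. A sketch under the assumption that the istiod column is always the second-to-last field, as in the line above; proxies_not_on_revision is a hypothetical helper name.

```shell
# Hypothetical helper: read `istioctl proxy-status` output on stdin and print
# the proxies whose istiod is NOT the given revision.
# Uses the second-to-last field so a two-word status like "NOT SENT" does not shift it.
proxies_not_on_revision() {
  rev="$1"
  awk -v prefix="istiod-${rev}-" '
    $1 == "NAME" { next }                        # skip the header row
    NF < 3       { next }                        # skip blank or malformed lines
    index($(NF-1), prefix) != 1 { print $1 }     # istiod name lacks the revision prefix
  '
}

# Usage (not run here): istioctl proxy-status | proxies_not_on_revision canary
```

An empty result would suggest every proxy has moved over to the new revision.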

After I updated all namespaces and restarted all deployments in those namespaces, I nervously uninstalled the old Istio version. The old version was installed with default settings, so its revision is default too.

istioctl x uninstall --revision default
  Removed HorizontalPodAutoscaler:istio-system:istiod.
  Removed PodDisruptionBudget:istio-system:istiod.
  Removed Deployment:istio-system:istiod.
  Removed Service:istio-system:istiod.
  Removed ConfigMap:istio-system:istio.
  Removed ConfigMap:istio-system:istio-sidecar-injector.
  Removed Pod:istio-system:istiod-7d5fdcc6c-jqrf9.
  Removed EnvoyFilter:istio-system:metadata-exchange-1.8.
  Removed EnvoyFilter:istio-system:metadata-exchange-1.9.
  Removed EnvoyFilter:istio-system:stats-filter-1.8.
  Removed EnvoyFilter:istio-system:stats-filter-1.9.
  Removed EnvoyFilter:istio-system:tcp-metadata-exchange-1.8.
  Removed EnvoyFilter:istio-system:tcp-metadata-exchange-1.9.
  Removed EnvoyFilter:istio-system:tcp-stats-filter-1.8.
  Removed EnvoyFilter:istio-system:tcp-stats-filter-1.9.
  Removed MutatingWebhookConfiguration::istio-sidecar-injector.
✔ Uninstall complete   

And my blog is still online, so, success!

EDIT: 20 Mar 2022. I’ve upgraded Istio again to 1.13.2 using the same procedure. Since canary was now the existing revision, I only needed to name the new revision something else, i.e.

istioctl install --revision=magpie

You may hit a validation error like this when updating a gateway:

error: gateways.networking.istio.io "mygateway" could not be patched: Internal error occurred: failed calling webhook "validation.istio.io": Post "https://istiod-canary.istio-system.svc:443/validate?timeout=30s": service "istiod-canary" not found

This is caused by a stale ValidatingWebhookConfiguration resource, which can be fixed with:

kubectl edit validatingwebhookconfigurations.admissionregistration.k8s.io istiod-istio-system
# and replace istiod-canary with the new revision, i.e. istiod-magpie
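The same edit could probably be done non-interactively. This is an untested sketch, not what I ran: repoint_validation_webhook is a hypothetical helper, and it assumes the only stale references are the literal service name istiod-<old revision>.

```shell
# Hypothetical helper: rewrite the stale istiod service name in the validating webhook.
# Assumes the stale references are the literal string istiod-<old revision>.
repoint_validation_webhook() {
  old_rev="$1"
  new_rev="$2"
  kubectl get validatingwebhookconfiguration istiod-istio-system -o yaml \
    | sed "s/istiod-${old_rev}/istiod-${new_rev}/g" \
    | kubectl replace -f -
}

# Usage (not run here): repoint_validation_webhook canary magpie
```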

🙂