ArgoCD, Jsonnet and Tanka

Ever since I installed ArgoCD in my garage Kubernetes lab, I’ve wanted to make Tanka work with ArgoCD, so that I can do GitOps with Jsonnet, in addition to YAML, Kustomize and Helm charts.

I was hugely inspired by (read: copied and pasted from) this blog post. Here are the steps I took to make Tanka work as a plugin of ArgoCD.

In my previous post, I created my own sideloader container image to load Tanka (tk) and Jsonnet Bundler (jb) into the ArgoCD repo server. In my opinion this is a better approach because it hides away all the beautiful bash commands needed to sideload binaries in ArgoCD’s official custom tooling instructions.

With the tk and jb binaries ready, the next step is to configure Tanka as a plugin of ArgoCD. A ConfigMap (named argocd-cm by default) holds most of ArgoCD’s server settings. A full YAML example for argocd-cm can be found here, but for this task only a plugin definition is needed.

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  configManagementPlugins: |
    - name: tanka
      init:
        # with my sideloader, binaries are in /sideloader directory
        command: ["/sideloader/jb"]
        args: ["install"]
      generate:
        # `sh -c` is necessary to substitute the ENVs in the args
        command: ["/bin/sh", "-c"]
        args: ["/sideloader/tk show environments/${TK_ENV} --dangerous-allow-redirect"]
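
In effect, for each Application the repo server runs jb install first, then the generate command with TK_ENV substituted. The same pipeline can be reproduced locally as a sanity check (a sketch, assuming the GitOps repo is checked out and TK_ENV is default, as set in the Application below):

# from the Tanka project directory, e.g. httpbin/
jb install
tk show environments/default --dangerous-allow-redirect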

The last step is to check in an ArgoCD Application that points to the Tanka repository. Note the plugin section below.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: httpbin
  namespace: argocd
  finalizers:
  - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    namespace: default
    server: https://kubernetes.default.svc
  project: default
  source:
    path: httpbin
    repoURL: https://github.com/raynix/argo-gitops.git
    targetRevision: HEAD
    # to invoke the tanka plugin
    plugin:
      name: tanka
      env:
        - name: TK_ENV
          value: default
  syncPolicy:
    automated:
      prune: true

The Tanka and Jsonnet files I wrote for httpbin are here. The result is very satisfying.

Good luck with your experiment 🙂

PS. I tried to use istio-libsonnet to generate Istio Gateway and VirtualService objects, but it was broken at the time, so I wrote my own simple helper for those Istio resources.
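
A hand-rolled helper along those lines can stay quite small. Below is a minimal sketch (not my actual code; the function name and parameters are made up for illustration):

// istio.libsonnet - hypothetical minimal helper for a VirtualService
{
  virtualService(name, host, gateway, destHost, destPort):: {
    apiVersion: 'networking.istio.io/v1beta1',
    kind: 'VirtualService',
    metadata: { name: name },
    spec: {
      gateways: [gateway],
      hosts: [host],
      http: [{
        route: [{
          destination: { host: destHost, port: { number: destPort } },
        }],
      }],
    },
  },
}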

Sideloader: An InitContainer to Sideload Stuff to Your Main Container

After having played with ArgoCD for a few days, I came across a blog post on how to get Grafana Tanka to work with ArgoCD. I like the idea of having Tanka as a plugin of ArgoCD, because:

  • The main ArgoCD docker image doesn’t get bloated by all those binaries we want to use with ArgoCD
  • Also I don’t need to wait for an ArgoCD release to use newer plugins

But eventually I need tk (the CLI binary of Tanka) in ArgoCD’s runtime container so it’s made available to ArgoCD applications. There are 2 ways to get tk into ArgoCD’s docker image: the Docker way and the Kubernetes way.

The Docker Way

It’s quite straight-forward to build a new docker image based on an upstream one and add stuff to the new one.

FROM quay.io/argoproj/argocd:v2.1.2
# downloading tk and jb binaries and mark them executable
RUN mkdir -p /tools && \
    curl -sL -o /tools/tk https://... && \
    curl -sL -o /tools/jb https://... && \
    chmod +x /tools/*

This works but sort of defeats the purpose of having tk as a plugin, i.e. the container image has to be rebuilt whenever either ArgoCD or tk has a new release.

The Kubernetes Way

ArgoCD already has instructions to load additional tools via volumeMounts, but the shell commands end up all over the place in the YAML. I built a tiny (8.3MB) sideloader docker image to get the job done in a DRYer fashion.

Here’s how to use the sideloader to add tk and jb binaries to the argocd-repo-server container:

# this is the argocd-repo-server deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-repo-server
...
spec:
  template:
    spec:
      # the shared emptyDir volume
      volumes:
        - name: sideloader
          emptyDir: {}
      # let the original repo-server container mount the shared volume
      containers:
        - name: argocd-repo-server
          volumeMounts:
            - name: sideloader
              mountPath: /sideloader
      # use the sideloader as an initContainer to download the binaries
      initContainers:
        - name: sideloader
          image: ghcr.io/raynix/sideloader:latest
          # args are processed in pairs: <file name> <download URL>
          args:
            - tk
            - https://github.com/grafana/tanka/releases/download/v0.17.3/tk-linux-amd64
            - jb
            - https://github.com/jsonnet-bundler/jsonnet-bundler/releases/download/v0.4.0/jb-linux-amd64
          volumeMounts:
            - name: sideloader
              mountPath: /sideloader

After the new pods are running, I can verify that tk and jb have been downloaded into the argocd-repo-server container as expected:

argocd@argocd-repo-server:~$ ls -lht /sideloader/
total 18M
-rwxr-xr-x 1 _apt ssh 7.5M Sep 28 13:38 jb
-rwxr-xr-x 1 _apt ssh 9.8M Sep 28 13:38 tk
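
The same check works without an interactive shell, assuming ArgoCD is installed in the argocd namespace:

kubectl -n argocd exec deploy/argocd-repo-server -- ls -lht /sideloader/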

The user _apt and group ssh are actually curl_user and curl_group, set by the curl base image that sideloader is built on. Not perfect, but this won’t block anything.

🙂

Run ArgoCD with Istio Service Mesh in a Kubernetes Cluster

It’s been quite a while since I installed Flux CD v2 in my garage Kubernetes lab. As there’s a lot of debate going on between Flux and ArgoCD, I decided to give ArgoCD a go. The other reason to try ArgoCD is that it supports Jsonnet.

With the default installation, ArgoCD uses a self-signed TLS certificate and enforces TLS connections, which means users get to see a security warning and have to trust the certificate to continue. Naturally, with Istio handling ingress and TLS termination, I would like to enable the Istio sidecar for ArgoCD and run ArgoCD in HTTP mode.

Here are the steps to configure and install ArgoCD alongside Istio:

Enable Istio Sidecar

I chose to enable automatic Istio sidecar injection for ArgoCD’s namespace.

# create the namespace, by default it's argocd
kubectl create namespace argocd
# turn on istio injection
kubectl label namespace argocd istio-injection=enabled
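
To confirm the label took effect (the -L flag prints the label as an extra column):

kubectl get namespace argocd -L istio-injection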

Install ArgoCD the Normal Way

kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Disable TLS for argocd-server Deployment

This can be done before or after the deployment is applied to the cluster in the step above, e.g. edit install.yaml before running the apply command, or use the kubectl edit deployment command afterwards. This tweak would probably be easier with Helm.

# kubectl edit deployment argocd-server
# then add the --insecure argument to the container command
...
      containers:
      - command:
        - argocd-server
        # run argocd-server in HTTP mode behind the Istio sidecar
        - --insecure
...
# save and exit; a new pod with --insecure will start and replace the old one
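
For a non-interactive alternative, the same change can be applied with a JSON patch (a sketch; it assumes argocd-server is the first container in the pod spec):

kubectl -n argocd patch deployment argocd-server --type json \
  -p '[{"op": "add", "path": "/spec/template/spec/containers/0/command/-", "value": "--insecure"}]'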

Sample Gateway Manifest

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: argocd-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
    - hosts:
        - argo.example.com
      port:
        name: https
        number: 443
        protocol: HTTPS
      tls:
        mode: SIMPLE
        # argo-cert is a tls secret in istio-system namespace, containing a valid TLS cert for the domain name argo.example.com
        credentialName: argo-cert
    - hosts:
        - argo.example.com
      port:
        name: http
        number: 80
        protocol: HTTP
      tls:
        httpsRedirect: true

I use cert-manager and Let’s Encrypt to provision free TLS certificates for my personal projects. For more info please see this.
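
For illustration, the argo-cert secret referenced in the Gateway above could come from a cert-manager Certificate like this (a sketch; the letsencrypt ClusterIssuer name is an assumption):

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: argo-cert
  # the secret must live in istio-system so the ingress gateway can read it
  namespace: istio-system
spec:
  secretName: argo-cert
  dnsNames:
    - argo.example.com
  issuerRef:
    # hypothetical ClusterIssuer backed by Let's Encrypt
    name: letsencrypt
    kind: ClusterIssuer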

Sample VirtualService Manifest

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: argocd
spec:
  gateways:
    - argocd-gateway
  hosts:
    - argo.example.com
  http:
    - route:
      - destination:
          host: argocd-server
          port:
            number: 80

With the DNS record in place and pointing to the Istio ingress gateway, I can see ArgoCD in my browser with a valid TLS certificate.
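
A quick check from outside the cluster (assuming the DNS record points at the ingress gateway's external IP):

# should print the ingress gateway's external IP
dig +short argo.example.com
# plain HTTP should be redirected to HTTPS by the httpsRedirect setting
curl -sI http://argo.example.com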

🙂

Update GCP IAM Adaptively with Terraform DataSources

In a scenario where a service account in a central GCP project needs to be accessible by a group of GKE service accounts across multiple GCP projects, the IAM part of the Terraform HCL could look like this:

resource "google_service_account" "service_account" {
  account_id   = "sa-${var.environment}"
  display_name = "Test Service Account"
  project      = var.project_id
}

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    "serviceAccount:xxx.svc.id.goog[k8s-namespace/k8s-sa]",
    "serviceAccount:yyy.svc.id.goog[k8s-namespace/k8s-sa]",
    ...
  ]
}

I can make a variable for the members, so it becomes:

variable "project_ids" {
  type = list(string)
}

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    for project_id in var.project_ids: "serviceAccount:${project_id}.svc.id.goog[k8s-namespace/k8s-sa]"
  ]
}

But the project_ids variable still needs to be populated in a tfvars file with hard-coded project IDs. Is there a more flexible way to do this, so that I don’t need to add or remove a project ID from the list as projects come and go?

With the google_projects data source, I can list and filter project IDs based on a filter string. However, I couldn’t find a filter for the condition that a project has a GKE cluster with Workload Identity turned on, such as:

# this does NOT work! Just wishful thinking
data "google_projects" "cas_projects" {
  filter = "gke_workload_identity: true"
}

Then the last hope is the external data source, as always. I use the google_projects data source to get filtered project IDs first, then use a bash script as the external data source to pick out the GCP projects which have GKE and Workload Identity enabled.

First, the google_projects data source filters projects by GCP folder IDs:

variable "gcp_folder_ids" {
  type = list(string)
}

data "google_projects" "gcp_projects" {
  filter = join(" OR ", [ for folder_id in var.gcp_folder_ids: "parent.id: ${folder_id}"])
}
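
For example, with gcp_folder_ids = ["123", "456"], the join above renders the filter string as:

parent.id: 123 OR parent.id: 456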

Then the external data source picks up the project IDs and further filters them with the bash script:

data "external" "gcp_projects_with_wli" {
  program = ["bash", "${path.module}/scripts/project-ids-with-wli-enabled.sh"]

  query = {
    project_ids = join(",", [ for proj in data.google_projects.gcp_projects.projects: proj.project_id ])
  }
}

The bash script requires gcloud and jq to run, and it needs to impersonate a service account which has permission to list and query all GCP projects under the organization.

#!/bin/bash
# this is scripts/project-ids-with-wli-enabled.sh
# set -e
if [[ -z "${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}" ]]; then
  export CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE=$HOME/.config/gcloud/application_default_credentials.json
else
  gcloud config set auth/impersonate_service_account "${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}"
fi

function filter_gcp_project() {
  # read the comma-separated project IDs from the JSON query on stdin
  for project_id in $(jq -rc '.project_ids' | tr ',' ' '); do
    pool_id=$(
      gcloud container clusters list --project "$project_id" --format json \
        | jq -r '.[0].workloadIdentityConfig.workloadPool'
    )
    # keep the project only if its first cluster's workload pool matches
    [[ "$pool_id" == "${project_id}.svc.id.goog" ]] && echo "$project_id"
  done
}

declare -a VERIFIED_PROJECT_IDS=()
VERIFIED_PROJECT_IDS+=( $(filter_gcp_project) )
# @csv quotes each field, so the sed strips the escaped quotes from the final JSON
jq -rn '{ "verified_project_ids": $ARGS.positional|@csv }' --args ${VERIFIED_PROJECT_IDS[*]} | sed 's|\\\"||g'
# sample output
# { "verified_project_ids": "projectid1,projectid2" }
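
The script can be tested outside of Terraform by feeding it the same JSON document that the external data source sends on stdin (the project IDs here are placeholders):

echo '{"project_ids":"project-a,project-b"}' | ./scripts/project-ids-with-wli-enabled.sh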

Unfortunately, the external data source only supports strings as input and output, so all the project IDs have to be joined into a single string as input and then split again to form a list, etc.

Finally, here’s the updated IAM binding block using the external data source, with a lot of string manipulation 🙂

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    for proj_id in split(",", data.external.gcp_projects_with_wli.result.verified_project_ids) : "serviceAccount:${proj_id}.svc.id.goog[cert-manager/ksa-google-cas-issuer]"
  ]
}