Update GCP IAM Adaptively with Terraform Data Sources

In a scenario where a service account in a central GCP project needs to be accessible by a group of GKE service accounts across multiple GCP projects, the IAM part in Terraform HCL could look like this:

resource "google_service_account" "service_account" {
  account_id   = "sa-${var.environment}"
  display_name = "Test Service Account"
  project      = var.project_id
}

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    "serviceAccount:xxx.svc.id.goog[k8s-namespace/k8s-sa]",
    "serviceAccount:yyy.svc.id.goog[k8s-namespace/k8s-sa]",
    # ...
  ]
}

I can extract the members into a variable, so the binding becomes:

variable "project_ids" {
  type = list(string)
}

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    for project_id in var.project_ids: "serviceAccount:${project_id}.svc.id.goog[k8s-namespace/k8s-sa]"
  ]
}

But the project_ids variable still needs to be populated in a tfvars file with hard-coded project IDs. Is there a more flexible way to do this, so that I don’t need to add or remove project IDs from the list as projects come and go?

With the google_projects data source, I can list and filter project IDs based on a filter string. However, I couldn’t find a filter for the condition that a project has a GKE cluster with Workload Identity enabled, such as:

# this does NOT work! Just wishful thinking
data "google_projects" "cas_projects" {
  filter = "gke_workload_identity: true"
}

Then the last hope is, as always, the external data source. I first use the google_projects data source to get a coarse list of project IDs, then use a bash script as the external data source to keep only those GCP projects which have GKE and Workload Identity enabled.

First, the google_projects data source filters projects by GCP folder IDs:

variable "gcp_folder_ids" {
  type = list(string)
}

data "google_projects" "gcp_projects" {
  filter = join(" OR ", [ for folder_id in var.gcp_folder_ids: "parent.id: ${folder_id}"])
}
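
For example, with two hypothetical folder IDs in var.gcp_folder_ids, the joined expression renders to a single filter string:

# gcp_folder_ids = ["111111111111", "222222222222"] renders to
filter = "parent.id: 111111111111 OR parent.id: 222222222222"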

Then the external data source picks up the project IDs and filters them further with the bash script:

data "external" "gcp_projects_with_wli" {
  program = ["bash", "${path.module}/scripts/project-ids-with-wli-enabled.sh"]

  query = {
    project_ids = join(",", [ for proj in data.google_projects.gcp_projects.projects: proj.project_id ])
  }
}
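
Terraform serializes the query map to JSON and passes it to the script on stdin, so with some made-up project IDs the script receives:

{ "project_ids": "projectid1,projectid2,projectid3" }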

The bash script requires gcloud and jq to run. It also needs to impersonate a service account which has permission to list and query all GCP projects under the organization:

#!/bin/bash
# this is scripts/project-ids-with-wli-enabled.sh
# set -e is left off on purpose: a gcloud error on a single project
# should not abort the whole scan
if [[ -z "${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}" ]]; then
  export CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE="$HOME/.config/gcloud/application_default_credentials.json"
else
  gcloud config set auth/impersonate_service_account "${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}"
fi

# read the comma-separated project IDs from the query JSON on stdin and
# print only those whose first GKE cluster has Workload Identity enabled,
# i.e. the workload pool is <project-id>.svc.id.goog
function filter_gcp_project() {
  for project_id in $(jq -r '.project_ids' | tr ',' ' '); do
    pool_id=$(
      gcloud container clusters list --project "$project_id" --format json \
        | jq -r '.[0].workloadIdentityConfig.workloadPool'
    )
    if [[ "$pool_id" == "${project_id}.svc.id.goog" ]]; then
      echo "$project_id"
    fi
  done
}

declare -a VERIFIED_PROJECT_IDS=()
VERIFIED_PROJECT_IDS+=( $(filter_gcp_project) )
# the external data source expects a flat JSON object of strings, so join
# the verified project IDs back into one CSV value; the sed strips the
# escaped quotes that @csv puts around each ID
jq -rn '{ "verified_project_ids": $ARGS.positional|@csv }' --args "${VERIFIED_PROJECT_IDS[@]}" | sed 's|\\\"||g'
# sample output
# { "verified_project_ids": "projectid1,projectid2" }

Unfortunately, the external data source only supports strings as input and output values, so all the project IDs have to be joined into a single string on the way in, then split back into a list on the way out, etc.

Finally, here is the updated IAM binding block using the external data source, with a lot of string manipulation 🙂

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    for proj_id in split(",", data.external.gcp_projects_with_wli.result.verified_project_ids) : "serviceAccount:${proj_id}.svc.id.goog[cert-manager/ksa-google-cas-issuer]"
  ]
}
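
With the sample output above, the members list resolves to one entry per verified project (projectid1 and projectid2 are the made-up IDs from that sample):

members = [
  "serviceAccount:projectid1.svc.id.goog[cert-manager/ksa-google-cas-issuer]",
  "serviceAccount:projectid2.svc.id.goog[cert-manager/ksa-google-cas-issuer]",
]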

Grant a Service Account an IAM Role in AWS/GCP

How do you grant a pod running in a Kubernetes cluster the permissions it needs to access cloud resources such as S3 buckets? The most straightforward approach is to save an API key in the pod and use it to authenticate against cloud APIs, but distributing long-lived keys is neither convenient nor safe. If the cluster is running inside the cloud, an IAM role can instead be bound to a service account in the cluster, which is both convenient and safe.

I’ll compare how an IAM role is bound to a service account in AWS/EKS versus GCP/GKE.

AWS/EKS

EKS is the managed Kubernetes service in AWS. To bind an EKS service account to an AWS IAM role:

  1. Create an IAM OIDC provider for your cluster
  2. Create an IAM role which can be assumed by the EKS service account
  3. Annotate the EKS service account to assume the IAM role (see the sketch after this list)
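
A minimal sketch of steps 1 and 3 with eksctl and kubectl, where the cluster, namespace, service account, role name and account ID (my-cluster, my-namespace, my-ksa, my-pod-role, 111122223333) are all placeholders:

# step 1: create an IAM OIDC provider for the cluster
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve

# step 3: annotate the EKS service account with the IAM role to assume
kubectl annotate serviceaccount my-ksa \
  --namespace my-namespace \
  eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/my-pod-role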

GCP/GKE

GKE is the managed Kubernetes service in GCP. In GCP this binding is called Workload Identity (WLI); in a nutshell, it binds a GKE service account to a GCP IAM service account, so it’s a bit different from the approach above. The full instruction is here, but in short:

  1. Enable WLI for the GKE cluster
  2. Create or update node-pool to enable WLI
  3. Create IAM service account and assign roles with necessary permissions
  4. Allow IAM service account to be impersonated by a GKE service account
  5. Annotate the GKE service account to impersonate the GCP service account (see the sketch after this list)
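
A minimal sketch of steps 4 and 5 with gcloud and kubectl, where PROJECT_ID, my-namespace, my-ksa and my-gsa are placeholders:

# step 4: allow the GKE service account to impersonate the IAM service account
gcloud iam service-accounts add-iam-policy-binding \
  my-gsa@PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[my-namespace/my-ksa]"

# step 5: annotate the GKE service account with the IAM service account
kubectl annotate serviceaccount my-ksa \
  --namespace my-namespace \
  iam.gke.io/gcp-service-account=my-gsa@PROJECT_ID.iam.gserviceaccount.com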

🙂