Update GCP IAM Adaptively with Terraform DataSources


In a scenario where a service account in a central GCP project needs to be usable by Kubernetes service accounts in GKE clusters across multiple GCP projects, the IAM part in Terraform HCL could look like this:

resource "google_service_account" "service_account" {
  account_id   = "sa-${var.environment}"
  display_name = "Test Service Account"
  project      = var.project_id
}

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    "serviceAccount:xxx.svc.id.goog[k8s-namespace/k8s-sa]",
    "serviceAccount:yyy.svc.id.goog[k8s-namespace/k8s-sa]",
    ...
  ]
}

I can turn the project IDs into a variable, so the binding becomes:

variable "project_ids" {
  type = list(string)
}

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    for project_id in var.project_ids: "serviceAccount:${project_id}.svc.id.goog[k8s-namespace/k8s-sa]"
  ]
}

But the project_ids variable still needs to be populated in a tfvars file with hard-coded project IDs. Is there a more flexible way to do this, so that I don’t need to add or remove a project ID from the list when projects come and go?
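For illustration, the hard-coded list would look something like the sketch below. The file name terraform.tfvars is an assumption, and xxx and yyy are the same placeholder project IDs used in the first binding, not real projects.

```hcl
# terraform.tfvars (assumed file name)
# Every project that comes or goes means editing this list by hand.
project_ids = [
  "xxx", # placeholder project ID
  "yyy", # placeholder project ID
]
```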

With the google_projects data source, I can list and filter project IDs with a filter string. However, I couldn’t find a filter for the condition that a project has a GKE cluster with Workload Identity turned on, such as:

# This does NOT work! Just wishful thinking.
data "google_projects" "cas_projects" {
  filter = "gke_workload_identity: true"
}

So the last resort is, as always, the external data source. I first use the google_projects data source to get a filtered list of project IDs, then use a bash script as the external data source to keep only the GCP projects that have a GKE cluster with Workload Identity enabled.

First, the google_projects data source, filtered by GCP folder IDs:

variable "gcp_folder_ids" {
  type = list(string)
}

data "google_projects" "gcp_projects" {
  filter = join(" OR ", [ for folder_id in var.gcp_folder_ids: "parent.id: ${folder_id}"])
}
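To check what this data source actually returns before wiring it into anything else, an output along these lines can help. This is only a sketch: the output name candidate_project_ids is made up for illustration. As a worked example of the filter expression, if gcp_folder_ids were ["111", "222"] (invented folder IDs), the join would produce the string "parent.id: 111 OR parent.id: 222".

```hcl
# Hypothetical output, for inspection only: lists every project ID
# found under the configured folders, before any Workload Identity check.
output "candidate_project_ids" {
  value = [
    for proj in data.google_projects.gcp_projects.projects : proj.project_id
  ]
}
```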

Then the external data source picks up the project IDs and filters them further with the bash script:

data "external" "gcp_projects_with_wli" {
  program = ["bash", "${path.module}/scripts/project-ids-with-wli-enabled.sh"]

  query = {
    project_ids = join(",", [ for proj in data.google_projects.gcp_projects.projects: proj.project_id ])
  }
}

The bash script requires gcloud and jq to run. It also needs to impersonate a service account that has permission to list and query all GCP projects under the organization.

#!/bin/bash
# this is scripts/project-ids-with-wli-enabled.sh
# set -e
if [[ -z "${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}" ]]; then
  export CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE=$HOME/.config/gcloud/application_default_credentials.json
else
  gcloud config set auth/impersonate_service_account "${GOOGLE_IMPERSONATE_SERVICE_ACCOUNT}"
fi

# Read the query JSON from stdin and print each project ID whose first
# listed GKE cluster uses the project's own Workload Identity pool.
function filter_gcp_project() {
  for project_id in $(jq -r '.project_ids' | tr ',' ' '); do
    pool_id=$(
      gcloud container clusters list --project "$project_id" --format json \
        | jq -r '.[0].workloadIdentityConfig.workloadPool'
    )
    [[ "$pool_id" == "${project_id}.svc.id.goog" ]] && echo "$project_id"
  done
}

declare -a VERIFIED_PROJECT_IDS=()
VERIFIED_PROJECT_IDS+=( $(filter_gcp_project) )
# Emit a single JSON object whose value is a comma-separated string.
jq -n '{ "verified_project_ids": ($ARGS.positional | join(",")) }' --args "${VERIFIED_PROJECT_IDS[@]}"
# sample output
# { "verified_project_ids": "projectid1,projectid2" }

Unfortunately, the external data source only supports string values in its query and result, so the project IDs have to be joined into one string as input, and the output string has to be split back into a list.
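One way to keep that string handling in a single place is a local value, roughly as sketched below. The names wli_project_ids and the output are hypothetical. compact() is used on the assumption that the script may return an empty string when no project qualifies, in which case split() alone would yield a list containing one empty string.

```hcl
locals {
  # Split the comma-separated result back into a list and drop empty
  # entries (split(",", "") returns [""] rather than an empty list).
  wli_project_ids = compact(
    split(",", data.external.gcp_projects_with_wli.result.verified_project_ids)
  )
}

# Hypothetical output, handy for checking the script's result in a plan.
output "wli_project_ids" {
  value = local.wli_project_ids
}
```

Other blocks could then refer to local.wli_project_ids instead of repeating the split.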

Finally, here is the updated IAM binding block using the external data source, with a lot of string manipulation 🙂

resource "google_service_account_iam_binding" "service_account_workload_identity_binding" {
  service_account_id = google_service_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    # compact() drops the empty entry that split() returns when the script finds no projects
    for proj_id in compact(split(",", data.external.gcp_projects_with_wli.result.verified_project_ids)) : "serviceAccount:${proj_id}.svc.id.goog[cert-manager/ksa-google-cas-issuer]"
  ]
}