Solved: Uninstallation of config-connector Got Stuck in ArgoCD


The Kubernetes Config Connector is another level of IaC (Infrastructure as Code): it wraps Google Cloud resources, such as a Cloud Load Balancer, in Kubernetes CRDs (Custom Resource Definitions), so instead of writing Terraform HCL I can write YAML to manage GCP infrastructure.

However, when I needed to uninstall config-connector, the deletion got stuck in ArgoCD.

As always, kubectl describe is my friend in this kind of situation.

$ k describe ns config-connector-ops
Name:         config-connector-ops
Labels:       app.kubernetes.io/instance=config-connector
              cnrm.cloud.google.com/project-id=my-gcp-project-id
              kubernetes.io/metadata.name=config-connector-ops
              name=config-connector-ops
Annotations:  <none>
Status:       Terminating
Conditions:
  Type                                         Status  LastTransitionTime               Reason                Message
  ----                                         ------  ------------------               ------                -------
  NamespaceDeletionDiscoveryFailure            False   Tue, 11 Apr 2023 00:30:43 +1000  ResourcesDiscovered   All resources successfully discovered
  NamespaceDeletionGroupVersionParsingFailure  False   Thu, 06 Apr 2023 11:03:48 +1000  ParsedGroupVersions   All legacy kube types successfully parsed
  NamespaceDeletionContentFailure              False   Thu, 06 Apr 2023 11:03:48 +1000  ContentDeleted        All content successfully deleted, may be waiting on finalization
  NamespaceContentRemaining                    True    Thu, 06 Apr 2023 11:03:48 +1000  SomeResourcesRemain   Some resources are remaining: configconnectorcontexts.core.cnrm.cloud.google.com has 1 resource instances, rolebindings.rbac.authorization.k8s.io has 2 resource instances
  NamespaceFinalizersRemaining                 True    Thu, 06 Apr 2023 11:03:48 +1000  SomeFinalizersRemain  Some content in the namespace has finalizers remaining: configconnector.cnrm.cloud.google.com/finalizer in 3 resource instances
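
The conditions above only report counts per resource type. To see the individual objects and the finalizers they still hold, something like the following sketch may help. The function name list_remaining is hypothetical, and it assumes kubectl is on the PATH and pointed at the affected cluster (the `k` used in this post looks like a shell alias for it).

```shell
# Sketch: print KIND, NAME and FINALIZERS for everything still left in a namespace.
# list_remaining is a hypothetical helper name, not part of kubectl.
list_remaining() {
  # $1 = namespace to inspect
  kubectl api-resources --verbs=list --namespaced -o name |
    while read -r kind; do
      # </dev/null keeps kubectl from consuming the list of kinds on stdin
      kubectl get "$kind" -n "$1" --ignore-not-found --no-headers \
        -o custom-columns='KIND:.kind,NAME:.metadata.name,FINALIZERS:.metadata.finalizers' \
        </dev/null
    done
}
```

Usage would be `list_remaining config-connector-ops`. It issues one request per resource type, so it can take a while on clusters with many CRDs.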

It’s fairly obvious already: a few resources in this namespace still have the configconnector.cnrm.cloud.google.com/finalizer finalizer, which is why they are stuck pending termination. The first one to take a closer look at is the configconnectorcontext resource listed in the conditions above.

$ k describe configconnectorcontexts.core.cnrm.cloud.google.com configconnectorcontext.core.cnrm.cloud.google.com
...
Status:
  Errors:
    error during reconciliation: error building deployment objects: error transforming namespaced components: error getting namespace id for namespace config-connector-ops: error creating configmap 'configconnector-operator-system/namespace-id': configmaps "namespace-id" is forbidden: unable to create new content in namespace configconnector-operator-system because it is being terminated
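
The error names a second namespace, configconnector-operator-system, as the one being terminated. A quick way to confirm the phase of both namespaces could look like this sketch; ns_phases is a hypothetical helper name and it assumes kubectl access to the cluster.

```shell
# Sketch: show the lifecycle phase (Active or Terminating) of the given namespaces.
# ns_phases is a hypothetical helper name, not part of kubectl.
ns_phases() {
  # "$@" = one or more namespace names
  kubectl get namespaces "$@" \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.phase'
}
```

For this case the call would be `ns_phases config-connector-ops configconnector-operator-system`.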

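I guess this was caused by ArgoCD trying to delete everything at once: the context object couldn’t be reconciled, because the ConfigMap it needs could not be created in the configconnector-operator-system namespace, which was itself already marked for deletion. I had to remove the finalizers from these stuck resources by editing them manually and deleting the entries under metadata.finalizers: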

k edit configconnectorcontexts.core.cnrm.cloud.google.com configconnectorcontext.core.cnrm.cloud.google.com
k edit configconnector configconnector.core.cnrm.cloud.google.com
k edit rolebindings.rbac.authorization.k8s.io cnrm-manager-ns-binding-config-connector-ops
k edit rolebindings.rbac.authorization.k8s.io cnrm-admin-binding-config-connector-ops
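
A non-interactive alternative to editing each object by hand is a JSON merge patch that sets metadata.finalizers to null. This is only a sketch: clear_finalizers is a hypothetical helper name, and it assumes the controller that owns the finalizer will never get to run, since removing a finalizer skips whatever cleanup it was guarding.

```shell
# Sketch: drop all finalizers from one object so its pending deletion can complete.
# clear_finalizers is a hypothetical helper name, not part of kubectl.
clear_finalizers() {
  # $1 = resource type, $2 = object name, remaining args go to kubectl (e.g. -n <ns>)
  resource="$1"
  name="$2"
  shift 2
  # In a JSON merge patch, null removes the field entirely
  kubectl patch "$resource" "$name" --type=merge \
    -p '{"metadata":{"finalizers":null}}' "$@"
}
```

For example, `clear_finalizers rolebindings.rbac.authorization.k8s.io cnrm-admin-binding-config-connector-ops -n config-connector-ops` would cover one of the role bindings. A merge patch is used because it also works on custom resources.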

Then ArgoCD was glad to tell me that the config-connector app was successfully deleted. I also did the following clean-up, as ArgoCD missed these:

k delete validatingwebhookconfigurations.admissionregistration.k8s.io abandon-on-uninstall.cnrm.cloud.google.com
k delete validatingwebhookconfigurations.admissionregistration.k8s.io validating-webhook.cnrm.cloud.google.com
k delete mutatingwebhookconfigurations.admissionregistration.k8s.io mutating-webhook.cnrm.cloud.google.com
k delete clusterrolebindings.rbac.authorization.k8s.io cnrm-manager-cluster-binding-config-connector-ops
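
To check whether anything similar is left behind, a rough sketch is to list the same three cluster-scoped kinds and filter on "cnrm" in the name. The helper name list_cnrm_leftovers and the name-based filter are assumptions; leftovers with other names or of other kinds would not show up.

```shell
# Sketch: list cluster-scoped webhook configurations and cluster role bindings
# whose names mention "cnrm". list_cnrm_leftovers is a hypothetical helper name.
list_cnrm_leftovers() {
  kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations,clusterrolebindings \
    -o name | grep -i 'cnrm' || true   # grep exits non-zero when nothing matches
}
```

Running `list_cnrm_leftovers` after the clean-up should print nothing if those objects are gone.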

🙂