Intermediate Architecture

GitOps at Scale with Argo CD and Multi-Cluster Kubernetes

GitOps shifts the operational model for Kubernetes to a Git-centric approach. All desired state lives in version control, and a reconciliation loop continuously drives the cluster toward that state. Argo CD, a CNCF graduated project, is one of the most widely adopted tools for running this loop at scale.

Why GitOps Changes Everything

Traditional deployment pipelines push changes to infrastructure. GitOps inverts this — the cluster pulls its desired state from Git. This shift has three structural implications:

  • Auditability: Every change is a Git commit. History is the deployment log.
  • Self-healing: With automated sync and selfHeal enabled, manual drift is reverted automatically on the next reconciliation.
  • Rollback: Rolling back is a git revert followed by a sync, provided the promotion model is well defined.

Architectural Pro Tip

Separate your application manifests repository from your application code repository. This prevents accidental coupling between deployment state and source history.

Multi-Cluster Management with ApplicationSet

Argo CD’s ApplicationSet controller enables fleet-level management. A single ApplicationSet resource can generate individual Application objects for every cluster in your fleet:

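apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook
  namespace: argocd                  # assumes Argo CD runs in the argocd namespace
spec:
  generators:
    - clusters: {}                   # one Application per cluster registered with Argo CD
  template:
    metadata:
      name: '{{name}}-guestbook'     # {{name}} is the cluster name
    spec:
      project: default
      source:
        repoURL: https://github.com/org/gitops-repo
        targetRevision: HEAD
        path: apps/guestbook/overlays/{{name}}   # one overlay directory per cluster
      destination:
        server: '{{server}}'         # API server URL of the target cluster
        namespace: guestbook
      syncPolicy:
        automated:
          prune: true                # delete live resources that were removed from Git
          selfHeal: true             # revert manual changes made in the cluster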
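
The empty clusters generator matches every cluster registered with Argo CD. To target only part of the fleet, the generator also accepts a label selector that is matched against the cluster secrets. The following is a minimal sketch rather than a recommended layout: the ApplicationSet name and the env: staging label are hypothetical, and it assumes Argo CD is installed in the argocd namespace.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook-staging          # hypothetical name
  namespace: argocd                # assumes Argo CD runs in the argocd namespace
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: staging           # hypothetical label set on the Argo CD cluster secret
  template:
    metadata:
      name: '{{name}}-guestbook'
    spec:
      project: default
      source:
        repoURL: https://github.com/org/gitops-repo
        targetRevision: HEAD
        path: apps/guestbook/overlays/{{name}}
      destination:
        server: '{{server}}'
        namespace: guestbook
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

Splitting the fleet by label this way lets non-production clusters keep automated sync while production clusters are handled by a separate, more conservative definition.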

Progressive Delivery with Rollouts

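Pair Argo CD with Argo Rollouts for progressive delivery. A canary rollout shifts traffic to the new version in steps before full promotion. Without a traffic-routing integration (a service mesh or ingress controller), Argo Rollouts approximates each weight by adjusting the ratio of canary to stable pods: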

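apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook
spec:
  # Excerpt: replicas, selector, and the pod template are omitted for brevity.
  strategy:
    canary:
      steps:
        - setWeight: 10              # send roughly 10% to the canary
        - pause: { duration: 5m }    # observe for five minutes
        - setWeight: 40
        - pause: { duration: 10m }
        - setWeight: 100             # full promotion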
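
The excerpt above shows only the strategy. For reference, a complete Rollout also needs a selector and a pod template, and a pause step with no duration holds the rollout until someone promotes it. The sketch below illustrates both under stated assumptions: the resource name, labels, replica count, and image are placeholders, not values from the original configuration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: guestbook                  # hypothetical name
spec:
  replicas: 10
  selector:
    matchLabels:
      app: guestbook               # hypothetical label
  template:
    metadata:
      labels:
        app: guestbook
    spec:
      containers:
        - name: guestbook
          image: example.com/guestbook:1.0.0   # placeholder image
  strategy:
    canary:
      steps:
        - setWeight: 10
        - pause: {}                # no duration: waits for a manual promotion
        - setWeight: 40
        - pause: { duration: 10m } # timed pause, then the rollout completes
```

With the Argo Rollouts kubectl plugin installed, the indefinite pause is released with kubectl argo rollouts promote guestbook.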

Reality Check

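Enabling automated sync with pruning on production clusters without a promotion gate is risky: pruning deletes any live resource that no longer exists in Git, so a merge that removes, renames, or moves a manifest deletes the corresponding resource as soon as it syncs. Always gate production sync behind a manual approval step, such as a required pull request review or a manually triggered sync.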
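
One way to express such a gate in Argo CD itself is to leave the automated sync policy off for production Applications, so that Argo CD reports differences but waits for an operator to apply them. This is a sketch under assumptions, not the only possible gate: the Application name, the prod overlay path, and the in-cluster destination are hypothetical.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook-prod             # hypothetical name
  namespace: argocd                # assumes Argo CD runs in the argocd namespace
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-repo
    targetRevision: HEAD
    path: apps/guestbook/overlays/prod       # hypothetical overlay directory
  destination:
    server: https://kubernetes.default.svc   # placeholder: the cluster Argo CD runs in
    namespace: guestbook
  # No syncPolicy.automated block: the Application shows OutOfSync after a merge,
  # and nothing is applied or pruned until an operator triggers the sync.
```

An operator then reviews the diff and syncs explicitly, for example with argocd app sync guestbook-prod --prune when deletions are intended.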

Drift Detection and Alerting

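Argo CD’s sync status surfaces drift as it is detected: an application whose live state differs from Git is reported as OutOfSync. Health status is a separate signal that describes whether the resources themselves are working. Integrate with your alerting stack by scraping Argo CD’s Prometheus metrics and alerting on OutOfSync states that persist beyond your SLA threshold. In recent Argo CD releases the sync state is exposed as the sync_status label on the argocd_app_info metric; argocd_app_sync_status is a legacy metric name from older releases, so check which one your version exports.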
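
As an illustration of such an alert, the rule below fires when an application stays OutOfSync for a sustained period. It is a sketch that assumes the Prometheus Operator is installed (for the PrometheusRule resource) and a recent Argo CD release that exposes sync state as the sync_status label on argocd_app_info. The rule name, namespace, severity, and the 15 minute window are hypothetical and should be replaced with your own SLA threshold.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-drift-alerts        # hypothetical name
  namespace: monitoring            # hypothetical namespace
spec:
  groups:
    - name: argocd-sync
      rules:
        - alert: ArgoCDAppOutOfSync
          # One series per application; selects those currently reported OutOfSync.
          expr: argocd_app_info{sync_status="OutOfSync"} == 1
          for: 15m                 # hypothetical SLA threshold
          labels:
            severity: warning
          annotations:
            summary: 'Application {{ $labels.name }} has been OutOfSync for more than 15 minutes'
```

The for clause is what turns a momentary OutOfSync state, which is normal between a merge and the next sync, into an alert only when it persists.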

The Multicloud Factor

Argo CD is not the only GitOps implementation, and the platform context determines which tool fits best.

Azure / AKS: Microsoft supports Flux v2 natively through the Azure GitOps extension for AKS and Arc-enabled Kubernetes. Flux is a CNCF graduated project and reduces operational overhead for organisations standardising on Azure-managed services — you do not run Argo CD as a separate operational concern. For teams that specifically want Argo CD’s UI, ApplicationSet fleet management, and Argo Rollouts integration, self-managed Argo CD on AKS is a well-trodden path. Both are production-grade choices; the decision is mainly about managed service preference versus ecosystem features.

OCI / OKE: OCI DevOps Service provides pipeline-driven deployment to OKE with rollback capability, but it is not a GitOps engine in the Argo CD sense — it does not implement continuous reconciliation against a declared Git state. For pure GitOps on OCI, self-managed Argo CD on OKE works as it does on any Kubernetes cluster. The operational overhead of running the Argo CD control plane is yours.

Flux vs Argo CD: Argo CD has a richer UI, ApplicationSet for fleet management, and native integration with Argo Rollouts. Flux is more modular — separate controllers for image automation, Helm releases, and kustomization — and has deeper Azure Arc integration. For multi-cluster fleets where fleet-wide rollout control matters, Argo CD’s ApplicationSet and cluster generator are generally the easier path. For Azure-centric estates that want managed tooling, Flux with the Azure GitOps extension reduces the operational surface.
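
To make the modularity point concrete, the sketch below shows how the same guestbook deployment might be declared in Flux, where the Git source and the reconciliation are separate resources handled by separate controllers. The resource names, the main branch, the staging overlay path, and the intervals are assumptions for illustration only.

```yaml
# Handled by the source controller: fetches the repository on an interval.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: gitops-repo                # hypothetical name
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/org/gitops-repo
  ref:
    branch: main                   # assumed branch
---
# Handled by the kustomize controller: applies a path from that source.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: guestbook                  # hypothetical name
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: gitops-repo
  path: ./apps/guestbook/overlays/staging   # hypothetical overlay directory
  prune: true                      # same pruning caveat as Argo CD applies here
  targetNamespace: guestbook
```

Where Argo CD bundles source, target, and sync policy into one Application, Flux splits them, which is easier to compose but leaves fleet-wide generation to other tooling.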

Closing Checklist

  • Separate application manifests from application source code. Deployment state and development history should not share the same repository.
  • Use ApplicationSet for fleet management. Hand-maintained Application objects per cluster do not scale; generating them from an ApplicationSet with a cluster generator does.
  • Gate production sync behind a manual approval step. Automated sync with pruning on production without a promotion gate deletes live resources whenever a merge removes or renames their manifests.
  • Pair Argo CD with Argo Rollouts for progressive delivery. Canary and blue-green strategies belong in the delivery pipeline, not in ad-hoc deployment scripts.
  • Alert on OutOfSync states persisting beyond your SLA threshold, using the sync_status label on argocd_app_info (or the legacy argocd_app_sync_status metric on older releases). Drift that is not surfaced is drift that accumulates.
  • Document the promotion model explicitly: which branches correspond to which environments, what triggers a sync, and what a rollback looks like. GitOps makes rollback easy only when the promotion model is well-defined before an incident.
  • For Azure / AKS: evaluate the native Flux GitOps extension before adding self-managed Argo CD. For teams that need Argo CD’s UI and ApplicationSet ecosystem, self-managed Argo CD on AKS is straightforward.
  • For OCI / OKE: self-managed Argo CD works as on any Kubernetes cluster. OCI DevOps Service provides pipeline-driven deployment but is not a continuous reconciliation engine — it does not replace GitOps if continuous reconciliation is a requirement.
