While kubectl apply effectively manages a single cluster, it becomes inefficient and problematic for managing a large fleet of fifty or more. As organizations scale their Kubernetes footprint, manual cluster management quickly becomes a liability. Configuration drift, security gaps, and “snowflake” clusters make disaster recovery a nightmare.
This guide takes you beyond basic GitOps tutorials. We will dive into the architectural patterns required to manage dozens of clusters from a single pane of glass using ArgoCD. You will learn about the “App of Apps” pattern, how to use ApplicationSets for dynamic multi-cluster targeting, strategies for handling secrets securely, and how to design directory structures that scale.
Section 1: The GitOps Architecture
1.1 Why GitOps for Multi-Cluster?
The core value proposition of GitOps is drift detection—knowing exactly when production deviates from your desired state in Git. In a multi-cluster environment, this visibility is non-negotiable. Git commit logs become your immutable deployment audit trail, providing a clear history of who changed what and when.
Furthermore, GitOps simplifies disaster recovery. Instead of manually piecing together a cluster’s state from backup scripts and documentation, you can recreate an entire cluster simply by pointing ArgoCD to your Git repository. The system automatically reconciles the live environment to match the stored configuration.
1.2 The “App of Apps” Pattern
Managing applications individually is manual toil that doesn’t scale. The “App of Apps” pattern solves this by introducing a “Root Application” that manages a folder of other “Child Applications.” This enables a one-click bootstrap of a new cluster: you deploy the Root App, and it automatically deploys everything else. The Root App can even manage itself if the path points to its own manifest.
Directory Structure for “App of Apps”:
.
├── argocd/
│ ├── root-app.yaml # The Master App
│ └── applications/ # Child Apps definitions
│ ├── guestbook.yaml
│ ├── prometheus.yaml
│ └── sealed-secrets.yaml
└── manifests/
├── guestbook/ # Actual K8s manifests
└── ...
The Root App Manifest (argocd/root-app.yaml):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: root-app
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/my-org/gitops-repo.git
targetRevision: HEAD
path: argocd/applications # Points to folder containing child apps
destination:
server: https://kubernetes.default.svc
namespace: argocd
syncPolicy:
automated:
prune: true
selfHeal: true
Section 2: Scaling with ApplicationSets
2.1 Generators: Dynamic Targeting
Static manifests require you to copy-paste YAML for every new cluster, which is error-prone. ArgoCD ApplicationSets solve this with “Generators.” A Cluster Generator automatically discovers clusters registered in ArgoCD based on labels and deploys applications to them.
By using dynamic templating variables like {{name}} and {{server}}, you can define a single ApplicationSet that configures child applications dynamically for each target cluster.
ApplicationSet using a Cluster Generator:
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: guestbook
namespace: argocd
spec:
generators:
- clusters:
selector:
matchLabels:
env: prod # Only target clusters labeled 'env=prod'
template:
metadata:
name: '{{name}}-guestbook'
spec:
project: default
source:
repoURL: https://github.com/argoproj/argocd-example-apps.git
targetRevision: HEAD
path: guestbook
destination:
server: '{{server}}'
namespace: guestbook
2.2 Directory Structure Strategy
Your Git folder structure dictates your operational workflow. For scalability, avoid copy-pasting YAML. Instead, adopt the “Base/Overlay” pattern using Kustomize. Define your shared configurations in a base directory and environment-specific differences in overlays.
Recommended Kustomize Structure:
.
├── base/
│ ├── deployment.yaml
│ ├── service.yaml
│ └── kustomization.yaml
└── overlays/
├── dev/
│ ├── kustomization.yaml
│ ├── replicas-patch.yaml # Sets replicas: 1
│ └── map-dev.yaml
└── prod/
├── kustomization.yaml
├── replicas-patch.yaml # Sets replicas: 3
└── map-prod.yaml
Hands-On Example: Deploying to “Dev” and “Prod” Automatically
Scenario: We will create an ApplicationSet that automatically deploys a “Guestbook” app to any cluster labeled env=dev, while ignoring other clusters.
Implementation Steps:
Label Clusters: Use the ArgoCD CLI to add labels to your connected clusters.
# Syntax: argocd cluster add <CONTEXT> --label <KEY>=<VALUE> argocd cluster add my-dev-cluster --label env=dev argocd cluster add my-prod-cluster --label env=prodDefine ApplicationSet: Create the manifest using the
clusterDecisionResourcegenerator (see the YAML in Section 2.1).Commit & Sync: Push the ApplicationSet manifest to Git. Watch ArgoCD’s UI; it will automatically populate the applications for the
devcluster and ignore theprodcluster if your selector wasenv=dev.
Verification:
- Verify the app appears in the ArgoCD UI for the “Dev” cluster.
- Verify the app is missing for the “Prod” cluster (or pending, if you adjusted the selector).
Best Practices & Optimization
Do’s:
- Use Kustomize or Helm: Raw YAML duplicates too much code. Use templating to keep your repository DRY (Don’t Repeat Yourself).
- Pin Versions: Always use specific Git tags or Helm chart versions. Never use
latest, as it introduces unpredictable changes. - Small Pull Requests: Keep PRs small. Large, monolithic PRs block the entire pipeline if one check fails.
Don’ts:
- Store Secrets in Plaintext: This is GitOps rule #1. Use External Secrets Operator (ESO). While Sealed Secrets are Git-native, they are static and harder to rotate at scale. ESO fetches secrets dynamically from AWS Secrets Manager or Vault, making it the preferred choice for enterprise scale.
- Manual Interventions: If you
kubectl edita resource, ArgoCD will revert it. Accept this discipline; the Git repository is the source of truth.
Performance & Security:
- Tip: Tune the ArgoCD “Reconciliation Timeout” to reduce API server load on large fleets.
- Tip: Use ArgoCD RBAC to restrict who can sync apps to Production environments.
Troubleshooting Common Issues
Issue 1: “Application is OutOfSync but diff is empty”
Cause: This is usually caused by a mutating admission webhook (like Istio injecting sidecars or Linkerd) modifying the resource after deployment. ArgoCD sees the live state differs from Git, but the “diff” is just the injected fields.
Solution: Configure
ignoreDifferencesin the ArgoCD Application manifest.spec: ignoreDifferences: - group: apps kind: Deployment jsonPointers: - /spec/replicas # Ignored because HPA manages it - group: "" kind: Secret jsonPointers: - /data/token # Ignored because it's dynamic
Issue 2: “Secrets aren’t syncing”
- Cause: The External Secrets Operator might not have permission to read the cloud provider’s vault.
- Solution: Check the ServiceAccount IAM role bindings and trust policies for the ESO controller.
Key Takeaways:
- ApplicationSets are the secret weapon for multi-cluster management, automating the targeting of workloads.
- Structure Matters: Your Git folder structure dictates your operational workflow; use Kustomize bases and overlays.
- Secrets Management is the biggest hurdle; solve it early with a robust tool like External Secrets Operator.
Next Steps:
- Audit your current manifests: Are you copy-pasting YAML?
- Refactor one application to use Kustomize bases and overlays.
- Experiment with a local Kind cluster to test ApplicationSets.
