Azure Policy as Code: Governance with Terraform and Bicep


A policy assigned at the wrong scope is inert. A policy with a typo in the condition silently fails to enforce anything. A DeployIfNotExists (DINE) policy without the right managed identity permissions creates remediation tasks that queue forever without executing. Azure Policy is the most capable governance tool in Azure, and the one most likely to produce a false sense of security when misconfigured.

Organizations often assign a policy and assume the environment is compliant. However, without moving from Audit to Deny and automating remediation, gaps remain open. This guide covers the full lifecycle of Policy-as-Code.

By the end, you will:

  • Implement the three-resource model: Definitions, Initiatives, and Assignments.
  • Write custom policy definitions in JSON and deploy them via Terraform and Bicep.
  • Assign regulatory benchmarks (CIS, NIST, MCSB) at the management group level.
  • Configure DINE policies to automate resource configuration.
  • Manage exceptions through time-bound exemptions.

This is Post 4 in the Azure Platform Engineering series.


The Three-Resource Model

Azure Policy separates its governance model into three distinct resources:

  1. Policy Definition: The rule itself — a JSON document specifying the if condition and then effect.
  2. Policy Initiative: A group of definitions (e.g., the CIS Benchmark). Use these to simplify management at scale.
  3. Policy Assignment: The application of a definition or initiative to a scope (Management Group, Subscription, or Resource Group). Nothing is evaluated or enforced until an assignment exists.

Exclusions vs. Exemptions

  • Exclusions: Structural carve-outs set on the assignment itself (notScopes), e.g., excluding the Sandbox MG from production Deny rules. Excluded scopes are not evaluated at all.
  • Exemptions: Precise, time-bound waivers for specific resources. The resource shows as “Exempt” in compliance reports.
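The difference between the two can be sketched in Terraform. This is a minimal illustration, not a reference implementation: every variable below is a hypothetical placeholder, and the resource arguments should be checked against the azurerm provider version you run.

```hcl
# All variables are hypothetical placeholders for this sketch.
variable "platform_mg_id" {
  type        = string
  description = "Management group where the production Deny policy is assigned"
}

variable "sandbox_mg_id" {
  type        = string
  description = "Child management group to carve out of the assignment"
}

variable "deny_policy_definition_id" {
  type        = string
  description = "ID of the Deny definition or initiative being assigned"
}

variable "legacy_resource_id" {
  type        = string
  description = "Resource that needs a temporary waiver"
}

variable "exemption_expires_on" {
  type        = string
  description = "RFC 3339 UTC timestamp at which the waiver lapses"
}

resource "azurerm_management_group_policy_assignment" "prod_deny" {
  name                 = "prod-deny-rules"
  policy_definition_id = var.deny_policy_definition_id
  management_group_id  = var.platform_mg_id

  # Exclusion: nothing under the Sandbox MG is evaluated by this assignment.
  not_scopes = [var.sandbox_mg_id]
}

# Exemption: the resource is still in scope, but reports as "Exempt" until expiry.
resource "azurerm_resource_policy_exemption" "legacy_waiver" {
  name                 = "legacy-waiver"
  resource_id          = var.legacy_resource_id
  policy_assignment_id = azurerm_management_group_policy_assignment.prod_deny.id
  exemption_category   = "Waiver"
  expires_on           = var.exemption_expires_on
}
```

The exclusion lives on the assignment and disappears from compliance data entirely, while the exemption is a separate, auditable resource with its own expiry.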
  graph TD
    subgraph Governance_Resources [Policy-as-Code Objects]
        Def[Policy Definition: 'Enforce Tags']
        Init[Policy Initiative: 'CIS Benchmark v2.0']
        Def --> Init
    end

    subgraph Scope_Hierarchy [Azure Management Scopes]
        MG_Root[Tenant Root]
        MG_LZ[Landing Zones MG]
        Sub_A[Workload Sub A]
        Sub_B[Workload Sub B]
    end

    Init -- Assignment --> MG_LZ
    Def -- Assignment --> Sub_A

    MG_LZ --> Sub_A
    MG_LZ --> Sub_B

Notes:

  • Definitions are the individual rules; Initiatives aggregate them for easier management.
  • Assignments link the policy to a scope (Management Group, Subscription, or Resource Group).
  • Inheritance ensures that a policy assigned at a high level (e.g., Landing Zones MG) covers all current and future resources beneath it.
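As a rough sketch of the definition-to-assignment link, the following Terraform declares an 'Enforce Tags' definition and assigns it to Workload Sub A, mirroring the diagram. The variable, the tag name "CostCenter", and the resource names are assumptions made for illustration. The initiative layer is left out to keep the example short, and the definition is created in the provider's default subscription.

```hcl
variable "workload_sub_a_id" {
  type        = string
  description = "Hypothetical: subscription scope in the form /subscriptions/<guid>"
}

# Definition: the rule itself, with the effect exposed as a parameter.
resource "azurerm_policy_definition" "enforce_tags" {
  name         = "enforce-tags"
  policy_type  = "Custom"
  mode         = "Indexed" # tags only exist on resource types that support them
  display_name = "Enforce Tags"

  parameters = jsonencode({
    tagName = { type = "String" }
    effect = {
      type          = "String"
      allowedValues = ["Audit", "Deny", "Disabled"]
      defaultValue  = "Audit"
    }
  })

  policy_rule = jsonencode({
    if = {
      field  = "[concat('tags[', parameters('tagName'), ']')]"
      exists = "false"
    }
    then = { effect = "[parameters('effect')]" }
  })
}

# Assignment: applies the definition to a scope; the effect falls back to its Audit default.
resource "azurerm_subscription_policy_assignment" "enforce_tags_sub_a" {
  name                 = "enforce-tags-sub-a"
  subscription_id      = var.workload_sub_a_id
  policy_definition_id = azurerm_policy_definition.enforce_tags.id

  parameters = jsonencode({
    tagName = { value = "CostCenter" }
  })
}
```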

Policy Effects and Enforcement

Audit and AuditIfNotExists

Use Audit to measure a gap before enforcing it. The standard progression is: assign as Audit, remediate existing resources, then convert to Deny.
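One way to encode that progression, sketched here under the assumption that each environment is deployed with its own value of a hypothetical `environment` variable, is a lookup table that maps environments to effects:

```hcl
variable "environment" {
  type        = string
  description = "Hypothetical environment name, e.g. sandbox or prod"
}

locals {
  # An environment moves from Audit to Deny once its existing resources are remediated.
  effect_by_environment = {
    sandbox = "Audit"
    prod    = "Deny"
  }

  # Unknown environments fall back to Audit so a new scope is never blocked by accident.
  policy_effect = lookup(local.effect_by_environment, var.environment, "Audit")
}

output "policy_effect" {
  value = local.policy_effect
}
```

The resulting value can then be passed to an assignment's `parameters` argument, for example `jsonencode({ effect = { value = local.policy_effect } })`, provided the definition exposes an `effect` parameter.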

Deny

Blocks the creation or update of non-compliant resources. ARM rejects the request before anything is written and returns a RequestDisallowedByPolicy error.

DeployIfNotExists (DINE)

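The most powerful effect. If a related resource (like a diagnostic setting) is missing, the policy assignment's managed identity deploys it automatically via an ARM template. New and updated resources are handled shortly after the request completes; resources that already existed are only fixed through a remediation task.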

DINE Example: Key Vault Diagnostic Settings

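The role ID below is Log Analytics Contributor. The nested diagnostic setting is named <vault name>/microsoft.insights/<setting name>. The workspaceId policy parameter is assumed to be declared in the definition's parameters section, and the AuditEvent log category is one illustrative choice.

{
  "if": { "field": "type", "equals": "Microsoft.KeyVault/vaults" },
  "then": {
    "effect": "DeployIfNotExists",
    "details": {
      "type": "Microsoft.Insights/diagnosticSettings",
      "roleDefinitionIds": [
        "/providers/Microsoft.Authorization/roleDefinitions/92aaf0da-9dab-42b6-94a3-d43ce8d16293"
      ],
      "deployment": {
        "properties": {
          "mode": "incremental",
          "parameters": {
            "kvName": { "value": "[field('name')]" },
            "workspaceId": { "value": "[parameters('workspaceId')]" }
          },
          "template": {
            "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
            "contentVersion": "1.0.0.0",
            "parameters": {
              "kvName": { "type": "string" },
              "workspaceId": { "type": "string" }
            },
            "resources": [
              {
                "type": "Microsoft.KeyVault/vaults/providers/diagnosticSettings",
                "apiVersion": "2021-05-01-preview",
                "name": "[concat(parameters('kvName'), '/microsoft.insights/platform-diag-settings')]",
                "properties": {
                  "workspaceId": "[parameters('workspaceId')]",
                  "logs": [{ "category": "AuditEvent", "enabled": true }]
                }
              }
            ]
          }
        }
      }
    }
  }
}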

Regulatory Compliance Benchmarks

Assign these at the Landing Zones management group to establish a baseline:

Initiative                                   ID (built-in)
Microsoft Cloud Security Benchmark (MCSB)    1f3afdf9-d0c9-4c3d-847f-89da613e70a8
CIS Azure Foundations Benchmark v2.0.0       06f19060-9e68-4070-92ca-f15cc126059e
NIST SP 800-53 Rev 5                         1f9634a6-3fde-4bd2-82f3-30b5702d8d90

Deploying via Terraform

Use jsonencode() for rules written inline in HCL, or file() to load policy JSON kept in standalone files.

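resource "azurerm_management_group_policy_assignment" "cis_benchmark" {
  name                 = "cis-foundations-lz"
  display_name         = "CIS Benchmark v2.0.0"
  policy_definition_id = "/providers/Microsoft.Authorization/policySetDefinitions/06f19060-9e68-4070-92ca-f15cc126059e"
  management_group_id  = var.landing_zones_mg_id
}

# DINE assignment: needs a managed identity, which in turn requires a location.
resource "azurerm_management_group_policy_assignment" "kv_diags" {
  name                 = "kv-diag-settings-lz"
  policy_definition_id = var.kv_diags_policy_definition_id # assumed variable: ID of the custom DINE definition
  management_group_id  = var.landing_zones_mg_id
  location             = var.location # assumed variable
  parameters           = jsonencode({ workspaceId = { value = var.log_analytics_workspace_id } }) # assumed variable
  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_role_assignment" "dine_remediation" {
  scope                            = var.landing_zones_mg_id
  role_definition_name             = "Log Analytics Contributor"
  principal_id                     = azurerm_management_group_policy_assignment.kv_diags.identity[0].principal_id
  skip_service_principal_aad_check = true # Avoids failures while the new identity is still replicating
}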
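The file() approach mentioned above might look like the following for the Key Vault DINE rule. This is a sketch under assumptions: the file path is hypothetical, the file is assumed to contain only the rule object (the "if"/"then" document), and the `management_group_id` argument follows the azurerm 3.x schema, so check it against your provider version.

```hcl
variable "landing_zones_mg_id" {
  type        = string
  description = "Resource ID of the Landing Zones management group"
}

resource "azurerm_policy_definition" "kv_diags" {
  name                = "kv-diag-settings"
  policy_type         = "Custom"
  mode                = "Indexed"
  display_name        = "Deploy diagnostic settings for Key Vault"
  management_group_id = var.landing_zones_mg_id

  # Hypothetical path; the file holds the rule object only.
  policy_rule = file("${path.module}/policies/kv-diag-settings.rule.json")

  # Policy parameters are declared outside the rule and referenced inside it.
  parameters = jsonencode({
    workspaceId = {
      type     = "String"
      metadata = { displayName = "Log Analytics workspace resource ID" }
    }
  })
}
```

Keeping the rule in its own file lets the same JSON be linted, diffed, and reused by a Bicep deployment, while the parameters stay in HCL where they are easier to compose.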

Best Practices

  • Parameterize the Effect: Always parameterize the effect field in your definitions. This allows the same rule to be Audit in Sandbox and Deny in Prod.
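  • Mode: All vs. Indexed: Indexed evaluates only resource types that support tags and location, so it skips subnets and other child resources. Use mode 'All' for networking policies and for anything that targets resource groups or subscriptions.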
  • Time-bound Exemptions: Never create an exemption without an expiry date.
  • Wait for Evaluation: Allow 15–30 minutes for the policy engine to evaluate resources before triggering remediation tasks.
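A remediation task for existing resources can itself be declared in Terraform. The sketch below assumes a DINE assignment already exists at the management group and that its ID is supplied through a hypothetical variable. Terraform creates the task at apply time and does not wait for the evaluation window described above, so the first run may find nothing to remediate.

```hcl
variable "landing_zones_mg_id" {
  type        = string
  description = "Resource ID of the Landing Zones management group"
}

variable "kv_diags_assignment_id" {
  type        = string
  description = "Hypothetical: ID of the DINE policy assignment to remediate"
}

# Deploys the missing related resources for everything the assignment flags as non-compliant.
resource "azurerm_management_group_policy_remediation" "kv_diags" {
  name                 = "remediate-kv-diags"
  management_group_id  = var.landing_zones_mg_id
  policy_assignment_id = var.kv_diags_assignment_id
}
```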

Limits to track (2026):

  • 500 policy definitions per scope.
  • 200 policy assignments per scope.
  • 1,000 exemptions per scope.

Troubleshooting

“DINE remediation task is stuck at ‘Running’”
Check the deployment history in the Policy blade. This usually indicates that the assignment's managed identity lacks the required role assignment on the target scope.

“Terraform replace on every plan”
Terraform compares JSON strings literally. Use jq --sort-keys in a pre-commit hook to normalize your policy JSON files.
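A Terraform-side alternative, offered as a sketch rather than a guaranteed fix, is to round-trip the file through jsondecode() and jsonencode(). Terraform emits object keys in sorted order when encoding, so whitespace and key order in the source file stop affecting the resulting string. The file path is a hypothetical placeholder.

```hcl
locals {
  # Hypothetical path to a policy rule kept as a standalone JSON file.
  raw_policy_rule = file("${path.module}/policies/enforce-tags.rule.json")

  # Decode and re-encode so formatting and key order in the file no longer matter.
  normalized_policy_rule = jsonencode(jsondecode(local.raw_policy_rule))
}

output "normalized_policy_rule" {
  value = local.normalized_policy_rule
}
```

Using `local.normalized_policy_rule` as the `policy_rule` value keeps the comparison inside Terraform, at the cost of the committed file no longer being byte-identical to what is sent to Azure.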



Next, proceed to Post 5: Subscription Vending. We will integrate these policies into the automated workload onboarding process.