Anthony Attwood

Punny Stuff

Azure Key Vault Access Policies with ARM

2020-08-26 · Anthony Attwood · azure

Azure Resource Manager templates let you define access policies for your Key Vault instances, but there’s some undocumented behaviour around how the access policies are applied. I recently had a situation where I needed to use ARM to allow an App Service managed identity to access vault secrets but I didn’t want to cause any existing access policies on the vault to be removed.

The Azure Key Vault ARM resource (Microsoft.KeyVault/vaults) is described in the ARM template reference documentation.

In normal use, the access policies you define in the accessPolicies element will completely replace any other access policies that might exist on the vault. I did not want this behaviour. Basically I had to answer this question:

How do I define Key Vault access policies in an ARM template so that the access policies are added if they don’t already exist (or updated if they do already exist) but don’t cause any other policies that might be on the Key Vault to be removed?

This is possible, but a bit tricky, and relies on (ab)using the createMode flag.

TL;DR: when the vault doesn’t already exist, use createMode = create; at all other times, use createMode = recover.

But first we have to detour a bit into the Key Vault lifecycle and the concept of soft delete and purge protection.

Key Vault lifecycle

See the Azure Key Vault soft-delete documentation for the details.

Key Vaults have a lifecycle; they’re more than just provisioned (existing) or deleted (not-existing). When you delete a Key Vault instance that has soft-delete enabled, the vault goes into a soft-delete state, from which it can be recovered before the soft-delete retention period expires.

Here’s one I’ve deleted. It’s in a soft-deleted state and can be recovered. Notice the -InRemovedState switch parameter.

$> Get-AzKeyVault -VaultName TonesKV1 -Location eastus -InRemovedState

Vault Name           : TonesKV1
Location             : eastus
Id                   : /subscriptions/********-****-****-****-************/providers/Microsoft.KeyVault/locations/eastus/deletedVaults/TonesKV1
Resource ID          : /subscriptions/********-****-****-****-************/resourceGroups/TonesKVBlog/providers/Microsoft.KeyVault/vaults/TonesKV1
Deletion Date        : 26/08/2020 2:02:18 AM
Scheduled Purge Date : 2/09/2020 2:02:18 AM
Tags                 :

From here I could recover the vault (Undo-AzKeyVaultRemoval or az keyvault recover), or purge it to permanently delete it (Remove-AzKeyVault -InRemovedState or az keyvault purge).

Purge Protection is another layer of delete-protection that, if it’s enabled, prevents the vault from being purged. The docs describe how these features work together.
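
The three states a vault can be in (active, soft-deleted, or gone entirely) matter for what follows, so here is a rough Az CLI sketch for telling them apart from a script. vault_state is a made-up helper name, not an Az CLI command. It assumes you are already logged in with az login, and that the name you pass matches the stored vault name exactly, since the query comparison is case-sensitive.

```shell
# Sketch: classify a vault as active, soft-deleted, or absent.
# vault_state is a hypothetical helper, not part of Az CLI.
vault_state() {
  local name="$1"
  # An active vault can be shown by name; a non-zero exit means it was not found
  # (or that the call failed for another reason, such as missing permissions).
  if az keyvault show --name "$name" --output none 2>/dev/null; then
    echo "active"
    return
  fi
  # Soft-deleted vaults are listed separately from active ones.
  local deleted
  deleted="$(az keyvault list-deleted --query "[?name=='$name'].name" --output tsv 2>/dev/null || true)"
  if [ -n "$deleted" ]; then
    echo "soft-deleted"
  else
    echo "absent"
  fi
}
```

Calling vault_state TonesKV1 prints one of the three state names, which a calling script can branch on.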

But back to our problem with the access policies.

(Re)Deploying a Key Vault without removing existing access policies

This is where the createMode flag comes in (see the ARM docs, ctrl+f createMode).

createMode takes one of two values:

  • create - the default value, used to create a new vault (or update an existing one)
  • recover - used to recover a vault in a soft-deleted state, but can also be used on an existing active (non-soft-deleted) vault instance

The two values also differ in how accessPolicies are applied to the vault, depending on the state the vault is in.

  • createMode = create
    • Key Vault does not exist
      • Creates the vault with the desired state.
      • Access policies are applied as defined in the ARM template.
    • Key Vault exists and is active
      • Updates the vault to the desired state.
      • Access policies defined in the ARM template are added (or updated).
      • Access policies not defined in the ARM template are removed.
    • Key Vault is soft-deleted
      • Error
  • createMode = recover
    • Key Vault does not exist
      • Error
    • Key Vault exists and is active
      • Updates the vault to the desired state.
      • Access policies defined in the ARM template are added (or updated).
      • Access policies not defined in the ARM template remain unchanged.
    • Key Vault is soft-deleted
      • Vault is recovered (no longer soft-deleted) and updated to desired state.

From the options above, we can see that there is no single setting that does what we want. If we always used create then the access policies would get replaced every time we redeployed the ARM template, but if we always used recover, then the vault provisioning would fail if the vault doesn’t exist.
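
To make the list above easier to scan, here it is again as a small lookup function. It makes no Azure calls; it only restates the behaviour described in this post. deployment_outcome and the state names absent, active and soft-deleted are labels I made up for the illustration.

```shell
# Sketch: the createMode behaviour from the list above, as a lookup.
# deployment_outcome is a hypothetical helper; it does not talk to Azure.
# Args: <createMode: create|recover> <vault state: absent|active|soft-deleted>
deployment_outcome() {
  case "$1:$2" in
    create:absent)        echo "created; template policies applied" ;;
    create:active)        echo "updated; unlisted policies removed" ;;
    create:soft-deleted)  echo "error" ;;
    recover:absent)       echo "error" ;;
    recover:active)       echo "updated; unlisted policies kept" ;;
    recover:soft-deleted) echo "recovered and updated" ;;
    *)                    echo "unknown input" >&2; return 2 ;;
  esac
}
```

Reading down either mode shows the problem: create never gives "unlisted policies kept", and recover errors out when the vault is absent.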

Are we any closer?

We can still get what we want: if the vault does not exist, we use createMode = create, but if the vault exists then we want createMode = recover.

So we should be able to do something like:

***snip the rest of the template***
"variables": {
  "doesVaultExist": "[someFunctionToFindOutIfTheVaultExists()]"
},
"resources": [
  {
    "name": "MyVault",
    "type": "Microsoft.KeyVault/vaults",
    "apiVersion": "2016-10-01",
    "location": "[resourceGroup().location]",
    "properties": {
      "tenantId": "[subscription().tenantId]",
      "createMode": "[if(variables('doesVaultExist'),'recover', 'create')]",
      "accessPolicies": [],
      "sku": { "name": "standard", "family": "A" }
    }
  }
]
***snip the rest of the template***

But this doesn’t work.

We’re on the right track, but unfortunately ARM doesn’t provide any way to ask if a resource exists. All the ‘sane’ ways you might think of will result in an error. There’s a feature request on the Azure Feedback forum (formerly UserVoice) to add it. Yes, there are …unpleasant… ways to do it, such as nested templates or querying tags set by a previous deployment, but they’re often not viable options.

So how do we do it?

Many ARM template deployments will involve some step before and/or after where you need to run some script and use Az CLI or Az PowerShell to do something you can’t do with ARM. This is the magic sauce.

Change the above ARM template snippet to require you to pass in whether or not the key vault should be expected to exist. Instead of trying to derive the true/false value inside the template, you pass it in as a parameter.

***snip the rest of the template***
"parameters": {
  "KeyVaultExists": {
    "type": "bool"
  }
},
"resources": [
  {
    "name": "MyVault",
    "type": "Microsoft.KeyVault/vaults",
    "apiVersion": "2016-10-01",
    "location": "[resourceGroup().location]",
    "properties": {
      "tenantId": "[subscription().tenantId]",
      "createMode": "[if(parameters('KeyVaultExists'),'recover', 'create')]",
      "accessPolicies": [],
      "sku": { "name": "standard", "family": "A" }
    }
  }
]
***snip the rest of the template***

You then use Az CLI or Az PowerShell to see if the vault exists before you do the ARM deployment. In PowerShell, it’d look something like this:

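# Get-AzKeyVault returns nothing when no active vault has this name
$exists = $null -ne (Get-AzKeyVault -VaultName $vaultName)
# The template's KeyVaultExists parameter is passed as a dynamic parameter
New-AzResourceGroupDeployment -ResourceGroupName $rgName -TemplateFile $file -KeyVaultExists $exists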
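
If you prefer Az CLI, the same idea might look roughly like the sketch below. vault_exists and deploy_vault are hypothetical helper names, and the resource group, template path and vault name are whatever your own script supplies. It assumes that a non-zero exit from az keyvault show means the vault isn’t there, although a permissions or login problem would look the same.

```shell
# Sketch: Az CLI version of the existence check and deployment.
# vault_exists and deploy_vault are hypothetical helper names.
vault_exists() {
  # `az keyvault show` exits non-zero when no active vault has this name.
  if az keyvault show --name "$1" --output none 2>/dev/null; then
    echo true
  else
    echo false
  fi
}

deploy_vault() {
  local rg="$1" template="$2" vault="$3"
  # Pass the result into the template's KeyVaultExists bool parameter.
  az deployment group create \
    --resource-group "$rg" \
    --template-file "$template" \
    --parameters KeyVaultExists="$(vault_exists "$vault")"
}
```

Usage would be something like deploy_vault "$rgName" "$file" "$vaultName". Note that a soft-deleted vault reports false here, so the template would use createMode = create and, per the list earlier, the deployment would fail until the vault is recovered or purged.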

But what if…

This method still has downsides.

The biggest among them is the need to know the name upfront. If your template implements a complicated or unpredictable naming scheme where you can’t know the vault name upfront, then you’re out of luck. In my experience though, you can generally know or predict the name of the key vault so this isn’t such a big problem.

It also requires you to run a script of some sort before you do the ARM deployment. This also isn’t often a big deal. You might already have a script where you can add the one-liner to query the existence of the vault. Or if you do your ARM deployment from a CD pipeline (Azure DevOps pipelines, GitHub Actions, etc), then it’s easy to add an extra step to query the existence of the vault and set a pipeline variable.
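
For the pipeline case, the extra step could be as small as the sketch below. emit_pipeline_variable is a made-up helper name. It assumes the script runs either in GitHub Actions, where the GITHUB_OUTPUT environment variable points at the step-output file, or in Azure DevOps, where a ##vso logging command on stdout sets a variable for later steps.

```shell
# Sketch: publish the existence check so a later deployment step can read it.
# emit_pipeline_variable is a hypothetical helper name.
emit_pipeline_variable() {
  local name="$1" value="$2"
  if [ -n "${GITHUB_OUTPUT:-}" ]; then
    # GitHub Actions: step outputs are appended to the file in $GITHUB_OUTPUT.
    echo "$name=$value" >> "$GITHUB_OUTPUT"
  else
    # Azure DevOps: a logging command on stdout sets a pipeline variable.
    echo "##vso[task.setvariable variable=$name]$value"
  fi
}
```

A step would call emit_pipeline_variable KeyVaultExists true (or false), and the deployment step then passes that value as the template parameter.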