Problem with VRA Service Crash on ZCA with Out of Azure VPGs Due to Tagging Policy Applying to Zerto Snapshots
The VRA service on an Azure ZCA may crash when that ZCA manages VPGs that replicate out of Azure.
Prior to 7.5 Patch 1, when replication out of Azure was done from non-managed disks, the snapshots used by Zerto were read-only, and tags could only be added at creation. A validation was added to the VRAs to verify that exactly one tag exists on each snapshot. Now that Zerto uses Managed Snapshots, the snapshots are no longer read-only, and tags can be added after creation.
This validation will have to be removed from the VRA before tagging of Managed Snapshots created by Zerto can be supported.
It is common to have a "tagging policy" in place in Azure for resource management (https://docs.microsoft.com/en-us/azure/governance/policy/tutorials/govern-tags). When such a policy applies to the snapshots within the Zerto Snapshot Resource Group in Azure, and at least one VPG is replicating out of Azure, the VRA will crash. This only occurs when Zerto is running version 7.5 or above, because of the switch to Managed Snapshots.
An Assert like the following appears repeatedly in the VRA logs before the VRA restarts:
AzureDeleteTaggedSnapshot,update,AzureComputeBatchOperation<deleteTaggedSnapshot#2> updating operation Type=GetMetaData targetObjectName=/subscriptions/4f464852-fe8f-4715-8f6f-7e44358a5704/resourceGroups/Zerto_1e999e9a-b566-4e9b-afd2-57b51d452540_Snapshots/providers/Microsoft.Compute/snapshots/ZertoPrtVolId-197726930202972146_2020.04.12.19.16.58.0639 id=17692, status=0
Assert,6644,NON,AzureDeleteTaggedSnapshot,onPostMetaInfo, AzureComputeOperations.cpp 117: m_snapshot.tags.size() == 1 failed. received 2 snapshot MDs for /subscriptions/<subscription_GUID>/resourceGroups/<resource_group>/providers/Microsoft.Compute/snapshots/<snapshot_name> m_tag=197726930202972146
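To confirm that a VRA log contains this failure, the Assert can be searched for programmatically. The following is a minimal sketch, assuming the log format matches the single sample above; the regular expression and the function name `find_tag_asserts` are illustrative and not part of any Zerto tooling.

```python
import re

# Pattern inferred from the one Assert sample in this article;
# other VRA builds may word the message differently.
ASSERT_RE = re.compile(
    r"m_snapshot\.tags\.size\(\) == 1 failed\. "
    r"received (?P<count>\d+) snapshot MDs for (?P<resource_id>\S+) "
    r"m_tag=(?P<tag>\d+)"
)


def find_tag_asserts(log_lines):
    """Yield (tag_count, snapshot_resource_id, m_tag) for each matching Assert."""
    for line in log_lines:
        match = ASSERT_RE.search(line)
        if match:
            yield int(match["count"]), match["resource_id"], match["tag"]
```

A reported count greater than 1 means the snapshot carries more tags than the VRA validation allows, which is consistent with a tagging policy being applied to it.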
The only solution is to disable automated tagging of resources in Azure, or to exclude the Zerto Snapshot resource group(s) from it. These can be identified by the naming convention zerto_siteID_snapshots. Technically, this issue can also be reproduced by manually tagging snapshots, but this is a far less likely cause than an automated policy in Azure.
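When a subscription has many resource groups, the ones to exclude can be picked out by name. The sketch below assumes the site ID is a GUID, as in the resource group shown in the log sample (Zerto_1e999e9a-b566-4e9b-afd2-57b51d452540_Snapshots), and matches case-insensitively; the function name `zerto_snapshot_groups` is hypothetical.

```python
import re

# Assumption: siteID is a GUID, as seen in the log sample in this article.
ZERTO_SNAPSHOT_RG = re.compile(
    r"^zerto_[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}_snapshots$",
    re.IGNORECASE,
)


def zerto_snapshot_groups(resource_group_names):
    """Return the names that follow the zerto_siteID_snapshots convention."""
    return [name for name in resource_group_names if ZERTO_SNAPSHOT_RG.match(name)]
```

The list of resource group names would come from whatever inventory is already available, such as an export from the Azure portal.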
Disable the Azure tagging policy or exclude the snapshot RG from it: (https://docs.microsoft.com/en-us/azure/governance/policy/overview#policy-assignment)
Once the automated tagging is disabled, delete any VPG replicating out of Azure (settings and disks can be saved for import).
Manually delete any snapshots left within the Zerto Snapshot Resource Group(s).
Re-import or re-create the deleted VPGs.
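After the VPGs are back in place, it may be worth checking that new snapshots are no longer picking up extra tags. The following is a rough sketch that mirrors the condition in the Assert (exactly one tag per snapshot). It assumes the input is a JSON array of snapshot objects with "name" and "tags" fields, such as the output of `az snapshot list --resource-group <resource_group> --output json`; the function name is hypothetical, and whether every snapshot in the group is subject to the VRA validation is not confirmed by this article.

```python
import json


def snapshots_violating_single_tag(snapshot_list_json):
    """Return {snapshot name: tag count} for snapshots without exactly one tag."""
    offenders = {}
    for snap in json.loads(snapshot_list_json):
        tags = snap.get("tags") or {}  # a snapshot with no tags may report null
        if len(tags) != 1:
            offenders[snap["name"]] = len(tags)
    return offenders
```

An empty result suggests the tagging policy is no longer touching the Zerto snapshots; any entry with a count above 1 suggests the exclusion has not taken effect.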