Article number
000003324
Affected Versions
6.5
7.0
Source Hypervisor
All
Target Hypervisor
Azure

After an Azure ZCA’s VRA Service Restarts, VPGs Do Not Start the Required Sync and Stop Replicating


Summary

An administrator may notice that the RPO of every VPG replicating to an Azure ZCA is suddenly rising. The RPO is typically about equal to the time elapsed since the ZCA, or the ZCA's VRA service, was last restarted or crashed.

Root Cause

Per Microsoft:

"It is related to the issue where sometimes the response received from the server is only partial content, due to network failures. When this SDK receives the partial response, which is an XML format content, it does not check the closings of the XML tags, skipping the partial content that should have triggered exception for customer to retry."

In short, Zerto uses the Azure SDK to list all journal items in the Azure container. When the SDK receives only a partial response, it does not raise an error and instead passes the partial listing along to the ZCA. Because not all of the journal files are found, the ZCA keeps retrying, which produces the repeating VRA log errors. As a result, a sync never begins and replication stays down.
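The failure mode Microsoft describes can be illustrated with a minimal Python sketch. It is not Zerto's or the Azure SDK's actual code: the function names, the `fetch` callable, and the simplified listing layout are assumptions made for illustration. The point is that a strict XML parse rejects a truncated listing, so the caller can retry instead of acting on a partial list.

```python
import xml.etree.ElementTree as ET


def parse_complete_listing(xml_text):
    """Return the item names in a listing, or raise ValueError if the XML is truncated.

    Hypothetical helper: a strict parse fails on unclosed tags, which is the
    check the quoted SDK behaviour was missing.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ValueError(f"incomplete listing response: {exc}") from exc
    return [name.text for name in root.iter("Name")]


def list_with_retry(fetch, attempts=3):
    """Call fetch() until it returns a well-formed listing, up to `attempts` times.

    `fetch` is an assumed callable that returns the raw XML text of one response.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return parse_complete_listing(fetch())
        except ValueError as exc:
            last_error = exc  # truncated response: try again
    raise RuntimeError(
        f"listing still incomplete after {attempts} attempts"
    ) from last_error


# Simplified stand-in for a container listing response (assumed layout).
COMPLETE = (
    "<EnumerationResults><Blobs>"
    "<Blob><Name>journal-0001</Name></Blob>"
    "<Blob><Name>journal-0002</Name></Blob>"
    "</Blobs></EnumerationResults>"
)
TRUNCATED = COMPLETE[:60]  # cut off mid-document, as after a network failure
```

With this check in place, a first truncated response followed by a complete one yields the full list of names, whereas a parser that ignored the missing closing tags would silently report fewer items than the container holds.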

Symptoms

Azure ZCA VRA service is restarted/crashes or the ZCA is rebooted/crashes.

Once the ZCA/VRA service stabilizes, all VPGs replicating to this ZCA should start either a bitmap or delta sync as expected. However, no sync ever kicks off and the RPO of the VPGs will continuously rise.

No connectivity alerts are raised, because the ZCA remains connected to its peer ZVMs and VRAs without issue.

The VRA logs show the following errors repeating in a loop:

2020-01-29 18:49:19.652,Error,8084,VRA,Mirror,init, m_mirId=5921807573304044400 10080 meta items failed in recovery:
.
2020-01-29 18:49:19.652,Error,8084,VRA,Mirror,init, m_mirId=5921807573304044400 failed:SFS Item: type 0x4 [COMPRESSED IO]:[269000000368000]<487021144,512,compBytes=177380>
.
2020-01-29 18:49:19.652,Error,8084,VRA,Mirror,init, m_mirId=5921807573304044400 failed:SFS Item: type 0x0 [UNCOMPRESSED IO]:[2690000003694d6]<493798904,512>
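To confirm that these errors are looping rather than appearing once, the log can be scanned for repeated "meta items failed in recovery" entries per mirror ID. The following Python sketch is a hypothetical helper, not a Zerto tool. Its regular expression is inferred only from the sample lines above and may need adjusting for other log layouts.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this article (an assumption,
# not a documented log format): timestamp, level, thread id, then
# "VRA,Mirror,init, m_mirId=<id> <message>".
INIT_ERROR = re.compile(
    r"^(?P<ts>[\d-]+ [\d:.]+),Error,\d+,VRA,Mirror,init, "
    r"m_mirId=(?P<mir>\d+) (?P<msg>.*)$"
)


def count_recovery_failures(lines):
    """Count 'meta items failed in recovery' errors per mirror ID."""
    counts = Counter()
    for line in lines:
        match = INIT_ERROR.match(line.strip())
        if match and "meta items failed in recovery" in match.group("msg"):
            counts[match.group("mir")] += 1
    return counts


def looping_mirrors(lines, threshold=2):
    """Return mirror IDs whose recovery failure appears at least `threshold` times."""
    counts = count_recovery_failures(lines)
    return sorted(mir for mir, seen in counts.items() if seen >= threshold)
```

A mirror ID returned by `looping_mirrors` has logged the same recovery failure more than once, which matches the retry loop described in the Root Cause section. The default threshold of 2 is an arbitrary choice for illustration.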

Solution

The only workaround is to recreate the VPG. Follow the Preseeding Volumes KB to recreate the VPG with preseeded disks, which avoids a full initial sync.


This issue was permanently fixed in Zerto version 7.5.