How to Troubleshoot Stuck Initial Syncs
An administrator may find that a new VPG gets stuck in its Initial Sync. This article explains how to perform basic troubleshooting of this issue.
There are three main reasons for an Initial Sync to become stuck:
Production VM powered off.
Network issues between the production and recovery VRAs.
Insufficient resources to maintain replication of the I/O load of the protected application.
A stuck Initial Sync typically shows the following symptoms:
The VPG is in a continuous Initial Sync that never completes.
The VPG's RPO rises continuously.
No checkpoints are created, leaving the VPG unable to recover.
If bandwidth or time-based throttling is enabled, the throttling values might be too low for the environment to keep up with replication, which delays the Initial Sync. Disabling throttling may allow the sync to complete.
Navigate to the site settings and check whether Bandwidth Throttling is enabled (for more information, see the documentation on bandwidth regulation). If it is enabled, check whether the value meets the minimum required bandwidth (for more information, see how to identify the minimum required bandwidth).
If the VRA's resources are congested, the sync may not progress. For more information, see how to validate VRA resources. If you find that additional resources are required, follow the KB Adding resources in VMware.
To check whether network resources are causing the bottleneck:
Use the iPerf tool to verify the bitrate available between the two sites. For more information on how to run iPerf, see How to check bandwidth using iperf.
Compare the results as follows:
Use the bandwidth output to verify that the observed bandwidth matches the configuration.
If the values match, log in to the ZVM UI and go to the dashboard page. There you can find the WAN graph, which you can compare to the iPerf results.
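The comparison between the measured bitrate and the required bandwidth can be scripted. The following is a minimal sketch, assuming the test was run with iperf3 in TCP mode and its JSON output option (`-J`); the field path `end.sum_received.bits_per_second` comes from that JSON report. The function names and the `required_mbps` value are hypothetical and should be replaced with the minimum bandwidth identified for your environment.

```python
import json


def received_mbps(iperf_json: str) -> float:
    """Return the receiver-side throughput in Mbit/s from an iperf3 -J report."""
    report = json.loads(iperf_json)
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    return bits_per_second / 1_000_000


def bandwidth_is_sufficient(measured_mbps: float, required_mbps: float) -> bool:
    """True when the measured bitrate meets or exceeds the required minimum."""
    return measured_mbps >= required_mbps


# Trimmed-down report with illustrative values only (not real measurements).
sample = '{"end": {"sum_received": {"bits_per_second": 94500000.0}}}'
measured = received_mbps(sample)
required_mbps = 50.0  # assumption: take this from your own sizing
print(f"measured {measured:.1f} Mbit/s, "
      f"sufficient: {bandwidth_is_sufficient(measured, required_mbps)}")
```

If the measured value is below the required minimum, the network is a likely bottleneck. If it is well above, compare it with the WAN graph in the ZVM UI before looking elsewhere.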
By default, Zerto will start throttling I/Os whenever the storage latency goes above 40 ms. This configuration can be viewed in the Site Settings menu -> Throttling -> Show Advanced Settings.
Do not change this configuration without consulting Zerto support.
Make sure that the storage latency does not go above the configured limit.
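If you have exported storage latency readings from your monitoring tool, a short script can flag the samples that exceed the limit. This is only a sketch: the 40 ms default comes from this article, while the function name and the list of readings are hypothetical and stand in for whatever data your storage or hypervisor monitoring provides.

```python
LATENCY_LIMIT_MS = 40.0  # default named in this article; use your configured value


def samples_over_limit(latencies_ms, limit_ms=LATENCY_LIMIT_MS):
    """Return (index, value) pairs for samples strictly above the limit."""
    return [(i, v) for i, v in enumerate(latencies_ms) if v > limit_ms]


# Illustrative readings in milliseconds (not real measurements).
readings = [12.0, 38.5, 41.2, 40.0, 75.0]
for index, value in samples_over_limit(readings):
    print(f"sample {index}: {value} ms exceeds {LATENCY_LIMIT_MS} ms")
```

Frequent or sustained samples above the limit suggest that storage latency, and the resulting I/O throttling, is slowing the sync.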
To replicate properly, the VRAs must be able to communicate bidirectionally over ports 4007 and 4008.
Please review the following KB for some network troubleshooting steps:
VRA Network troubleshooting using Plink
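As a quick complement to the KB above, the port check can be sketched with a plain TCP connection test. This assumes the ports are reachable over TCP and that you can run Python on a machine in the same network segment as one of the VRAs. The helper names are hypothetical, and the script only tests one direction, so it has to be repeated from the other site to confirm bidirectional connectivity.

```python
import socket

VRA_PORTS = (4007, 4008)  # ports named in this article


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_vra(host: str, ports=VRA_PORTS, timeout: float = 3.0) -> dict:
    """Map each port to whether it is reachable from this machine."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}


# Usage (replace the placeholder with the peer VRA's IP address):
#   for port, ok in check_vra("<peer-vra-ip>").items():
#       print(f"port {port}: {'reachable' if ok else 'unreachable'}")
```

A port reported as unreachable points to a firewall or routing problem between the sites rather than a resource issue on the VRA.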
If the issue persists, contact Zerto support and include the following information:
A reference to this KB
Screenshots of the results of all steps that were taken (including iPerf, VRA resources, etc.)
Names of the affected VPGs and their VRAs
The following logs will be needed:
Time frame - 8 Hours
Relevant hosts logs and hypervisor logs.
If VCD is being used, VCD logs will be required as well.
Collect the logs only after you’ve opened a case and have the case number.
For more information, see How to collect Zerto logs.