Test Failover VMs Become Unusable After Several Hours of Testing Failover

KB Number:

The intended purpose of a Failover Test is to ensure that all VMs within a VPG are recovered as expected to the defined checkpoint. This solution describes why the Failover Test VMs may become unresponsive after an extended period of time.


When a VPG performs a Failover Test, assuming the test network configured for the VPG is different from the production network, there is no effect on the production machines or their protection.

Zerto Virtual Replication manages the data changes for the Failover Test VMs at the recovery site and stores them in the journal, instead of writing to the replica volumes themselves. As such, Zerto Virtual Replication can remove the unneeded temporary data from the Failover Test VMs following the completion of the Failover Test.

Consequently, when a Failover Test runs for an extended period of time, the I/O changes made by the Failover Test VMs can fill the free space in the journal. When this happens, the Failover Test VMs become unresponsive because no space is left in the journal in which to write their I/O changes. Additionally, you will see alerts for the affected VPG reporting that the journal is full.
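As a rough illustration of the reasoning above, the following Python sketch estimates how long a Failover Test could run before test writes alone fill the journal. It is not part of Zerto Virtual Replication and uses no Zerto API: the function name, its parameters, and the sample figures are all hypothetical, and the free journal space and average write rate must be measured in your own environment. If anything else also consumes journal space during the test, the actual time will be shorter.

```python
def hours_until_journal_full(free_journal_gb: float, write_rate_mb_per_s: float) -> float:
    """Estimate hours until test writes fill the free journal space.

    Hypothetical helper for illustration only.
    free_journal_gb: free journal space when the test starts, in GB (assumed input).
    write_rate_mb_per_s: average write rate of the Failover Test VMs, in MB/s (assumed input).
    """
    if free_journal_gb < 0:
        raise ValueError("free_journal_gb must not be negative")
    if write_rate_mb_per_s <= 0:
        # No test writes means the test data never fills the journal.
        return float("inf")
    free_mb = free_journal_gb * 1024          # GB -> MB
    seconds = free_mb / write_rate_mb_per_s   # time to consume the free space
    return seconds / 3600                     # seconds -> hours


# Example with made-up figures: 100 GB free, test VMs writing 5 MB/s on average.
print(f"{hours_until_journal_full(100, 5):.1f} hours")  # prints "5.7 hours"
```

An estimate like this is only a planning aid for deciding how long a test can reasonably stay open; the journal alerts for the VPG remain the authoritative signal.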

Note: Zerto's best practice is to keep a VPG in a Failover Test operation only for as long as required. Activities that confirm the consistency of the Failover Test VPG, such as ensuring that the OS boots properly and that the application works as expected, should be performed in a timely manner.

Affected Versions:
Up to and including version 2.0 Update 3 (Xeme)
