• This topic has 4 replies, 3 voices, and was last updated December 22, 2021 by Robert M.

Issues post failover testing

  • We performed a DR test last week. During the test and afterwards, we experienced a lot of issues that appear to be related to disk corruption (e.g., corrupted files, volumes, and databases). We used the move option for testing because I wanted to make sure we didn’t lose any data. While moving out to our DR site, a few machines went into Windows recovery on startup. I rolled back and tried again, and was able to get them started the second time. We experienced many more issues moving back to our production site, and I wasn’t able to easily remediate all of them. In a few instances, we started getting errors about missing or corrupted files on the VMs, services that could not start as a result, etc. I ended up having to recover VMs and files from pre-move storage snapshots and Veeam backups at our production site. Needless to say, this does not instill confidence in Zerto, or in our ability to recover in a matter of minutes, as designed and promised when we purchased the product, if we ever need to do a real failover due to the loss of our production site.

    Has anyone experienced similar issues? I’m not quite sure where to start on troubleshooting and remediating these issues so they don’t happen again in the future. We are doing VMware > VMware replication, and all of these systems are Windows Server (mostly 2012 R2 and 2019). Any input is welcome.

    Hi Bryan,

    Can you please share what version of Zerto you are running? This could be linked to a known issue. If not, it’s something we can investigate via a case if you still have logs.

    Regards,

    Bob

    Hi Bob,

    We are running 9.0 Update 1.

    Thanks!

    Hi Brian,

    The known issue affects an older version of Zerto, so it is not the cause here. If you log a ticket, we can give you some better answers.

    This occurred for me during a recent failover (test). After trying four checkpoints, I was forced to perform a force sync. The force sync resolved the issue; however, that would not be an option in a disaster scenario.

    Support has suggested the VSS agent as a remedy, but the reason given was quite vague.

    The blue screen is disappointing to see. It would be great if Zerto would provide some general OS best practice KBs. All I can find is SQL BP.
