3-Step Failover Testing for Disaster Recovery
This post was contributed by Joshua Stenhouse, Zerto’s UK Sales Engineer.
In my role as a Sales Engineer at Zerto, I often get asked about testing disaster recovery. Testing is one of the most complex aspects of disaster recovery, so it’s clear why so many of the companies I speak to test their DR solution once a year or less. In this post I’ll detail the specifics of disaster recovery testing with Zerto, particularly its simplicity and the logic behind it. Please note that in a modern data center with redundant power supplies, generators, controllers, and networking, a single server failure is an unlikely event, which takes server clusters off the table when you’re considering your failover plans.
With Zerto’s Virtual Replication, performing failover and failback testing involves three steps:
1. Selecting the Virtual Protection Groups (VPGs) to test.
2. Selecting the point in time to test from, using the journal of changes.
3. Clicking ‘failover test’.
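Step 2 amounts to picking the most recent journal checkpoint at or before the moment you want to test from. The following is a minimal Python sketch of that idea only; it is not Zerto’s API, and `pick_checkpoint`, the 5-second interval, and the sample timestamps are all hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pick_checkpoint(checkpoints, requested):
    """Return the latest checkpoint at or before `requested`.

    `checkpoints` must be sorted in ascending order (hypothetical journal).
    """
    index = bisect_right(checkpoints, requested)
    if index == 0:
        raise ValueError("requested time is older than the journal")
    return checkpoints[index - 1]


# Hypothetical journal: one checkpoint every 5 seconds over one minute.
start = datetime(2024, 1, 1, 12, 0, 0)
journal = [start + timedelta(seconds=5 * i) for i in range(13)]

chosen = pick_checkpoint(journal, datetime(2024, 1, 1, 12, 0, 17))
print(chosen.time())  # 12:00:15
```

A binary search keeps the lookup fast even when the journal holds days of checkpoints taken every few seconds.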
Once the test starts, Zerto automatically performs the following actions:
1. Register the VMs in the recovery site with the name in the format of “vmname — failover test”.
2. Create a temporary thin-provisioned scratch VMDK for each VM in the target datastore, which stores any changes made in the failover test VMs.
3. Connect the failover test VMs to the port group specified for testing (which is hopefully not routable to production!).
4. Boot the VMs, allowing you to log into the console to check that the data is consistent and the applications work.
5. Leave the protected VMs powered on in production and continue replicating changes.
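The role of the scratch VMDK in action 2 can be pictured as a copy-on-write overlay: writes from the test VM land in the scratch disk, and reads fall back to the replica when a block has not been touched. Below is a toy Python sketch under that assumption; `FailoverTestDisk` and the block contents are hypothetical and do not reflect how Zerto implements it internally.

```python
class FailoverTestDisk:
    """Toy copy-on-write view: a replica plus a scratch overlay."""

    def __init__(self, replica_blocks):
        self._replica = replica_blocks  # never modified by the test VM
        self._scratch = {}              # block number -> data written in the test

    def read(self, block):
        # Prefer data written during the test, else fall back to the replica.
        if block in self._scratch:
            return self._scratch[block]
        return self._replica[block]

    def write(self, block, data):
        # Test writes only ever touch the scratch overlay.
        self._scratch[block] = data


replica = {0: "boot", 1: "orders-v1"}
disk = FailoverTestDisk(replica)
disk.write(1, "orders-v2")

print(disk.read(1))  # orders-v2 (the test VM sees its own change)
print(replica[1])    # orders-v1 (the replica is untouched)
```

Because the replica is only ever read, anything you do inside the test VM stays inside the test.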
All of this happens with no break in replication and no impact on production, meaning you can perform disaster recovery failover and failback testing during working hours, in minutes, with just a few clicks. Once you have finished your failover testing, you click to stop the failover test; Zerto then asks for the result of the application testing and lets you add notes.
Zerto will then perform the following actions:
1. Remove the failover test VMs from the inventory.
2. Delete the scratch VMDKs removing any changes made as part of the failover test.
3. Keep an up-to-date copy of all the changes made in production during the failover testing, so there is no re-sync period or interruption of the replication.
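To see why stopping a test needs no re-sync, it helps to model the journal as an append-only log that keeps receiving production writes while the test runs against a frozen point-in-time view. The Python sketch below is a toy model under that assumption; `ReplicaJournal`, `FailoverTest`, and the result fields are hypothetical names, not Zerto’s.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaJournal:
    """Append-only list of (sequence, block, data) writes from production."""
    entries: list = field(default_factory=list)

    def replicate(self, block, data):
        self.entries.append((len(self.entries), block, data))

    def view_at(self, sequence):
        # Replay writes up to and including `sequence`.
        blocks = {}
        for seq, block, data in self.entries:
            if seq > sequence:
                break
            blocks[block] = data
        return blocks


class FailoverTest:
    def __init__(self, journal, checkpoint):
        self.base = journal.view_at(checkpoint)  # frozen point-in-time view
        self.scratch = {}                        # changes made during the test

    def write(self, block, data):
        self.scratch[block] = data

    def read(self, block):
        return self.scratch.get(block, self.base.get(block))

    def stop(self, passed, notes=""):
        self.scratch.clear()  # test changes are thrown away
        return {"passed": passed, "notes": notes}


journal = ReplicaJournal()
journal.replicate(0, "a")
journal.replicate(1, "b")

test = FailoverTest(journal, checkpoint=1)
test.write(1, "test-change")
journal.replicate(1, "c")  # production keeps replicating during the test

result = test.stop(passed=True, notes="application login OK")
latest = journal.view_at(len(journal.entries) - 1)
print(latest)  # {0: 'a', 1: 'c'}
```

The journal already holds every production write made during the test, so once the scratch data is discarded there is nothing left to catch up on.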
You can then easily build PDF reports detailing the outcome of the failover testing, which you can customize with your own company logo.
This is all pretty cool and easy to do, but hopefully by now you are wondering what else you could use it for. After all, if you can bring a temporary copy of your VMs online from previous points in time (in increments of a few seconds, across up to 5 days of changes in the journal) and give vSphere Web console access to any user you choose, then you don’t have to use them only for testing disaster recovery! Some great ideas are:
- Testing both operating system and application upgrades before applying them to production, which is great for pesky change requests.
- Recovering files and folders using VMware Tools and PowerCLI, removing the need for network access to the testing recovery VM. I will be covering this in more depth in a separate blog post coming soon.
- Giving developers access to a copy of the VM for short-term testing.
- Performing database consistency checks.
- Running reports on databases without impacting production.
- Bringing an up-to-date copy of Active Directory services online in the failover test network, for use with VMs that require Active Directory for a successful failover test.
- Dynamically building a complete training environment on the fly, using an up-to-date copy of the data, with no changes made in production and all the changes deleted when the training has finished.
These are only the ideas I can think of, but if anybody can think of their own then please feel free to share. If you’d like to try out the simplicity of testing disaster recovery in Zerto then click here to request a trial today.