Should you consider vCHS-DR or a Zerto-powered Cloud Service Provider for DRaaS?
By Zerto, on 5 August, 2014
By Will Lin, Zerto Cloud Solutions Engineer
If you have heard of Zerto’s award-winning Zerto Virtual Replication (ZVR), you know that ZVR is used in hundreds of enterprises as the preferred disaster recovery (DR) and virtual replication solution. Enterprises use ZVR to easily replicate virtual machines from their production data center to an alternate DR data center, achieving recovery point objectives (RPO) of seconds and recovery time objectives (RTO) of minutes. ZVR is completely storage agnostic and works with different vSphere versions (including legacy vSphere 4.x).
What you may not know is that there are currently over 130 cloud service providers (CSPs) around the world providing DR as a Service (DRaaS) powered by Zerto. These CSPs (which include well-known names like Colt, Terremark, Kelway, Veristor, Peak10, iLand and Bluelock) use ZVR to replicate hosted workloads between their own IaaS cloud data centers, as well as to provide a DR cloud target for their enterprise customers.
Recently, VMware announced the availability of a DR offering in its multi-tenant public cloud, vCloud Hybrid Service (vCHS-DR). vCHS-DR allows customers running VMware vSphere to make live replicas of their running VMs in one of VMware’s vCHS data centers.
If you are an organization that’s exploring the possibility of using an external cloud service provider as your DR target, should you consider vCHS-DR or one of the Zerto-powered CSPs?
Well, judge for yourself. Besides pricing (hint: Zerto-powered CSPs are extremely price-competitive, and vCHS has been noted to be pricey, as you can read here), there are some significant technical differences between vCHS-DR and a Zerto-powered CSP. I’ll just list three obvious ones here:
DR Automation and Orchestration:
- vCHS-DR uses vSphere Replication (VR) without Site Recovery Manager (SRM) to deliver its DR service. In fact, vCHS-DR isn’t compatible with SRM. Moreover, vCHS-DR uses a version of vSphere Replication that is incompatible with the production version of VR, so any recovery plans you may already have will need to be completely redone. The vCHS-DR VR is just a replication mechanism that creates a VM replica at the target site. Without SRM, there’s nothing to automate and coordinate your actual DR failover. With VR, you have to configure each VM for replication individually, each with its own schedule and RPO. RPO is unpredictable due to the snapshot approach vCHS-DR uses. The minimum advertised RPO is 15 minutes, but that doesn’t match what you will normally see in production. The actual RPO will usually be much higher unless snapshots are run every 15 minutes, which causes considerable slow-down in the production datacenter. Also, there is no way to execute a non-disruptive DR test without actually failing over the VM. Without automation, there is no guarantee of a consistent RTO, and recovering just a few VMs can take several labor-intensive hours. Even with the expected new features, industry experts agree that vCHS doesn’t meet the requirements of most organizations.
- ZVR is both a per-VM replication mechanism and a full-featured DR automation/orchestration solution. RPO is measured in seconds and RTO is usually measured in minutes, even with heavy I/O workloads. With ZVR, you can pre-configure protection groups, protect multiple VMs together in a consistency group, re-IP the failed-over workloads, customize the boot order for failed-over VMs, and execute failback. You can also execute non-disruptive failover tests any time you want, without interrupting production workloads and without any break in replication or RPO. With ZVR, you can recover an entire site within minutes just by pushing the red button.
- Currently there are six vCHS datacenters, with expected expansion to 10.
- There are over 130 active cloud service providers offering Cloud DR powered by Zerto Virtual Replication. This encompasses global coverage across hundreds of datacenters.
- All Zerto-powered cloud service providers offer production-level support and SLAs.
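To put rough numbers on the snapshot-based RPO point in the automation comparison above, here is a minimal Python sketch. It uses a generic model of periodic replication and is not a description of how vSphere Replication schedules its cycles internally. The function name, the model, and the sample figures are all assumptions made for illustration.

```python
def worst_case_rpo_minutes(interval_min: float, transfer_min: float) -> float:
    """Worst-case data-loss window for periodic, snapshot-style replication.

    Assumptions (illustrative, not vendor-documented):
      * a replica is usable only once its transfer has finished;
      * cycles do not overlap, so a new cycle cannot start until the
        previous transfer is done.
    A write made just after a cycle starts stays unprotected until the
    NEXT cycle has started and finished transferring.
    """
    if interval_min <= 0 or transfer_min < 0:
        raise ValueError("interval must be > 0 and transfer time >= 0")
    # Time between cycle starts is the schedule or the transfer, whichever is longer.
    effective_interval = max(interval_min, transfer_min)
    return effective_interval + transfer_min


# Hypothetical figures: a 15-minute schedule with a 5-minute transfer,
# then the same schedule when each transfer takes 30 minutes.
print(worst_case_rpo_minutes(15, 5))
print(worst_case_rpo_minutes(15, 30))
```

Under these assumptions, a scheduled interval is a floor for the data-loss window rather than a guarantee, which is why an advertised 15-minute RPO and the RPO observed under load can differ.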
Multiple Point-In-Time Recovery:
- Even though VR 5.5 has the ability to recover to multiple point-in-time instances, this ability is disabled when using vCHS-DR. This is probably because VR needs to use VMware snapshots to create these multiple PIT recovery points. If you’re a VMware administrator and have seen or used multiple, chained VM snapshots, you know it’s something that no one (including vCHS admins) wants to do!
- ZVR is a CDP-like replication solution, with the ability to fail over to any point in time within the journal. The journal is configurable from 1 hour to 5 days. Recovering to a particular point in time with ZVR is literally as simple as operating a TiVo/DVR.
- You cannot fail back with vCHS-DR! Of course, that’s not what vCHS marketing material will tell you. Imagine that you are a vCHS-DR customer, and you actually suffered a disaster and failed your workloads over to vCHS. In order to get the workloads back to your original production site, you may actually incur a much longer outage than the original disaster itself!
This is how vCHS-DR “failback” works:
- Power off all production VMs at the vCHS cloud side. Outage starts here. End-users will not have access to the production application service during this outage.
- Go to the original source site and manually rename or delete the original VMs from vCenter inventory.
- Manually copy the VMs from vCHS back to the original source site using vCloud Connector (basically a vCD export and then a vSphere import). This is essentially a FULL COPY! [Note: Have you ever tried to copy large VMDKs across the WAN? Both processes are incredibly time-consuming.]
- Wait for the copy over the WAN.
- Once the VMs are copied back to the original source site, manually edit the VM network settings to connect to the original source site’s port groups.
- Manually power on VMs at the original source site. Outage now ends.
- Manually reconfigure and restart replication back to the cloud.
- All of the above steps are manual. Remember, there is no automation of any of these steps with vCHS-DR.
- With ZVR, reverse replication and failing back a protection group literally takes a few mouse clicks and is fully automated. If the VMDKs from the protected VMs are still intact at the original source site, ZVR will intelligently use those VMDKs as pre-seed targets. This means that when ZVR starts the reverse replication, only the changes to those VMDKs are transferred, not their entire data set. This dramatically reduces the replication time and WAN bandwidth utilization. Once the two sites are in sync, the actual failback process will only take minutes. Done and done.
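The gap between a full copy and a pre-seeded, changes-only transfer is easy to estimate with back-of-the-envelope arithmetic. The Python sketch below is illustrative only: the function name, the 80% link-efficiency factor, the 2 TB data set, the 100 Mbps WAN link and the 5% change rate are all assumed values, not measurements from either product.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move data_gb (decimal gigabytes) over a WAN link.

    link_mbps is the nominal link speed in megabits per second; efficiency is
    an assumed fraction of that speed usable for payload (hypothetical 0.8).
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, link speed, or efficiency")
    megabits = data_gb * 8 * 1000          # GB -> megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical scenario: 2 TB of VMDKs, 100 Mbps WAN, 5% of blocks changed
# while the workloads ran in the cloud.
full_copy = transfer_hours(2000, 100)          # every block crosses the WAN
delta_only = transfer_hours(2000 * 0.05, 100)  # only changed blocks cross the WAN

print(f"full copy:  {full_copy:.1f} h")
print(f"delta only: {delta_only:.1f} h")
```

In the manual procedure the full-copy time falls inside the outage window, because the VMs are powered off before the copy starts. With a pre-seeded reverse replication, the changes-only sync runs while the workloads are still live in the cloud, so only the final failback step counts against downtime.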
If you are considering DRaaS, I encourage you to check out one of the Zerto-powered cloud service provider partners in order to enjoy the industry’s most robust, full-featured disaster recovery and replication solution.