- This topic has 3 replies, 3 voices, and was last updated October 27, 2016 by Justin N.
Storage performance and best practices
Jason R, July 20, 2015 11:46:24 AM
I have been pondering for some time what would be best for Zerto and its storage requirements, especially in a multi-tenant DRaaS environment such as ours. I would welcome members' comments on their own experiences with this topic, as I think it would be useful for many of us.
One thing I have considered, but have not yet put into practice to test, is whether a pure-SSD journal would benefit the failover testing process (which is the more likely type of failover to be run compared to a live failover 🙂).
This is driven by the assumption that when we run a failover test, we not only use the journal volume (together with the source VMDK volume) to build the machine, we also use the journal volume to store the temporary writes the test makes. I can see this very quickly overwhelming a traditional disk-based shared journal volume and then impacting live replication (and potentially multiple end users, if you share the journal volume across multiple tenants).
Again, thoughts on this (hint: anyone from Zerto?) would be greatly appreciated.
Jason

Justin N, July 23, 2015 12:58:28 PM
Test-failover VMs can read from the recovery volumes, the journal, and the scratch journal, and they can also write to the scratch journal. I always tell folks to plan for comparable storage performance for the recovery volume/journal locations, and I elaborate on these details in the MOD video titled “Storage Considerations for Continuous Replication and Recovery Operations”.
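A rough mental model of that layering is a copy-on-write overlay: test writes land only in the scratch journal, and reads resolve against the scratch journal first, then the journal, then the recovery volume. The sketch below is purely illustrative; the class and block-map structure are assumptions for the example, not Zerto's actual implementation.

```python
# Illustrative model of test-failover IO resolution, per the description above:
# writes go only to the scratch journal; reads check scratch first, then the
# journal, then the recovery volume. Conceptual sketch only, not Zerto's design.

class TestFailoverDisk:
    def __init__(self, recovery_volume: dict, journal: dict):
        self.recovery_volume = recovery_volume  # fully applied, oldest data
        self.journal = journal                  # retained replicated checkpoints
        self.scratch = {}                       # temporary writes made during the test

    def write(self, block: int, data: bytes) -> None:
        # Test writes never touch the journal or the recovery volume,
        # so live replication data is left intact.
        self.scratch[block] = data

    def read(self, block: int) -> bytes:
        # Newest layer wins: scratch overlay, then journal, then recovery volume.
        for layer in (self.scratch, self.journal, self.recovery_volume):
            if block in layer:
                return layer[block]
        return b"\x00"  # block never written at any layer

disk = TestFailoverDisk(recovery_volume={0: b"base"}, journal={1: b"journaled"})
disk.write(1, b"test-write")
print(disk.read(0), disk.read(1))  # b'base' b'test-write'
```

This also shows why discarding a failover test is cheap in this model: dropping the scratch layer restores the pre-test view without touching the other layers.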
Hope these details help!
— Justin

Daniel J, September 29, 2016 04:56:43 PM
Those videos are great Justin! Thanks for making them.
I did not fully understand the age-out process prior to watching them.
So, if I understand this correctly, once the journal history has been met (which it should be for all protected VMs in a relatively short amount of time), each write on the protected side translates to 2 writes and 1 read on the recovery side: 1 write to the “front” of the journal, then 1 read from the “back” of the journal, followed by 1 write to the recovery disk.
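That amplification can be sketched with simple arithmetic. This is a conceptual model built from the 2-writes-plus-1-read ratio described above, not an official sizing formula, and the 500 IOPS figure is a hypothetical example.

```python
# Conceptual model of steady-state recovery-side IO amplification once the
# journal history has filled, per the 2-writes-plus-1-read description above.
# Not an official Zerto sizing formula; the example workload is hypothetical.

def recovery_side_ops(protected_write_iops: float) -> dict:
    """Estimate recovery-side IO for a journal that has reached its history limit."""
    journal_write = protected_write_iops    # new data written to the journal "front"
    journal_read = protected_write_iops     # oldest data read from the journal "back"
    recovery_write = protected_write_iops   # aged-out data applied to the recovery disk
    return {
        "journal_writes": journal_write,
        "journal_reads": journal_read,
        "recovery_volume_writes": recovery_write,
        "total_recovery_side_iops": journal_write + journal_read + recovery_write,
    }

# Example: 500 write IOPS on the protected VM -> roughly 1500 IOPS of
# combined journal and recovery-volume activity at the recovery site.
print(recovery_side_ops(500)["total_recovery_side_iops"])  # 1500
```

Under this model, the journal location absorbs two thirds of that load (one read stream plus one write stream), which is why journal placement matters so much for a busy DR array.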
So my question is: is there a best practice for setting up a tiered storage solution? Pin the journal to flash/SAS? Let the array figure out what's hot? Put it on different storage than other workloads and migrate after failover in a disaster?
We run some workloads off of our DR array, and our Zerto replication is killing it in terms of the latency it adds to the other VMs.
I’m not looking for a magic bullet or an official Zerto answer, just some advice would be welcome.
Thanks!

Justin N, October 27, 2016 04:25:21 PM
Thank you for the warm regards on the videos, I am glad they were helpful!
The easiest way to view the journal is as a FIFO (First In, First Out) queue of stored write changes over time. For example, if you have a 4-hour journal and the VPG has been protecting for 4 hours, then the first checkpoint that was created is read from the journal and written to the corresponding recovery volume(s) for that VM. This makes space in the journal for the latest replicated data to be received and written into the journal. The flow then continues: the oldest data in the journal is read and applied to the recovery volume(s) for that VM so that space is made for the latest replicated data.
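That FIFO age-out flow can be sketched as a toy model. A real journal retains data by time (e.g. a 4-hour history); here a fixed queue length stands in for that window, and the checkpoint/write structures are invented for illustration, not Zerto's internals.

```python
from collections import deque

# Toy model of the journal's FIFO age-out described above. A fixed capacity
# stands in for the time-based history setting (e.g. 4 hours). The data
# structures are illustrative only, not Zerto's implementation.

JOURNAL_CAPACITY = 4  # pretend each entry is one hour of checkpoints

journal = deque()
recovery_volume = {}

def replicate(checkpoint_id: int, writes: dict) -> None:
    if len(journal) == JOURNAL_CAPACITY:
        # Journal full: read the oldest checkpoint from the "back"
        # and apply its writes to the recovery volume...
        _, oldest_writes = journal.popleft()
        recovery_volume.update(oldest_writes)
    # ...making room to write the newest checkpoint at the "front".
    journal.append((checkpoint_id, writes))

# Replicate 6 checkpoints through a 4-entry journal.
for cp in range(6):
    replicate(cp, {cp: f"data-{cp}"})

print(sorted(recovery_volume))        # [0, 1] -> the two oldest aged out
print([cp for cp, _ in journal])      # [2, 3, 4, 5] -> still recoverable points
```

Note how any checkpoint still inside the queue remains a valid recovery point, while everything older has already been folded into the recovery volume.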
As for recovery-site storage performance, our recommendation is to plan for storage performance comparable to the production site, since the near-real-time replication will follow similar IO patterns. Zerto's default behavior is to locate the journal on the same datastore that the VM is set to recover to (i.e., on this screen in the VPG wizard: http://s3.amazonaws.com/zertodownload_docs/Latest/Zerto%20Virtual%20Replication%20Zerto%20Virtual%20Manager%20%28ZVM%29%20-%20vSphere%20Online%20Help/index.html#page/ScreenReference%2FAdvancedVMReplicationSettings.html%23). You can also have Zerto change the datastore by changing the value in the journal column on that screen, either before saving the VPG (in which case the software will create the journal on that datastore) or by editing the VPG (in which case the software will Storage vMotion the journal for you). The same process can be followed to move recovery volume locations as well, both at initial creation or to have Zerto Storage vMotion the recovery volume for you.
Zerto also actively monitors storage performance to ensure that it is not negatively impacted. These advanced settings are located on this screen but should not be adjusted without the assistance of our support team: http://s3.amazonaws.com/zertodownload_docs/Latest/Zerto%20Virtual%20Replication%20Zerto%20Virtual%20Manager%20%28ZVM%29%20-%20vSphere%20Online%20Help/ScreenReference/AdvancedSettings.html#ww1183945 . I would recommend opening a case if you believe that Zerto is causing storage latency at the recovery site, so our team can verify.
Note: The links above will expire as new online help documentation is served from our support portal. Here are the details on their locations.
Link 1: Online help –> Screen References –> The Zerto Virtual Manager User Interface –> Advanced VM Replication Settings Dialog
Link 2: Online help –> Screen References –> The Zerto Virtual Manager User Interface –> Site Settings –> Performance and Throttling Dialog