Amazon Web Services Outage and Cloud Disaster Recovery

By Zerto, on 1 July, 2012

Looks like quite a few people will be skipping their July 4 vacation to review and rebuild their disaster recovery (DR) plans. By now, even if you’re not a customer of Netflix, Instagram or Pinterest, you’ve certainly heard of the power outage at an Amazon Web Services (AWS) data center in northern Virginia on Friday night. It was caused by an electrical storm in the Washington, D.C. area that left over a million residents without power, and many more people worldwide without their favorite hobbies and movies.

The Facebook post below includes a letter from Amazon to its customers, with some explanation of what happened. Amazon’s Relational Database Service (RDS) can replicate data across several machines. The goal is to make the database more durable and, where replicas also serve reads, faster, since queries can go to more than one machine. In this situation, “inconsistency issues” means that when the replicated database came back up, the data was inconsistent and could not be restored. RDS is not a typical database, but it should certainly survive an electricity outage!
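To make “inconsistency” concrete, here is a toy sketch in Python. It is not how Amazon RDS is implemented; the `Replica` class, the `replicate` function and the `power_fails_after` parameter are all hypothetical names invented for illustration. It only shows how a write that reaches some copies but not others, because power is lost partway through, leaves the copies disagreeing.

```python
# Toy model only: NOT Amazon RDS internals. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    log: list = field(default_factory=list)  # ordered writes this copy has applied


def replicate(write, replicas, power_fails_after=None):
    """Apply one write to each replica in turn.

    If power_fails_after is set, power is lost once that many replicas
    have applied the write, so the remaining replicas never see it.
    """
    for i, replica in enumerate(replicas):
        if power_fails_after is not None and i >= power_fails_after:
            return False  # outage: the write was only partially replicated
        replica.log.append(write)
    return True


def is_consistent(replicas):
    """Replicas agree only if every copy holds the same ordered log."""
    return all(r.log == replicas[0].log for r in replicas)


replicas = [Replica("a"), Replica("b"), Replica("c")]

replicate("order-1", replicas)                        # reaches all three copies
ok_before = is_consistent(replicas)

replicate("order-2", replicas, power_fails_after=1)   # reaches only replica "a"
ok_after = is_consistent(replicas)

print(ok_before, ok_after)
```

After the simulated outage, replica "a" holds two writes while "b" and "c" hold one, so there is no single obviously correct copy to restore from. Real systems use techniques such as write-ahead logs and quorum acknowledgements to avoid or repair this state.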

 

Cloud is Not Magic

Focus ran a really interesting discussion of an older AWS outage in its Cloud Roundtable series. The panelists discuss blame, SLAs and one major cloud challenge, expressed by George Reese as “the marketing messages and confusion in the marketplace.” He continued, “People think cloud is the place you don’t have to worry about stuff. Cloud is not magic.” A similar sentiment was expressed in this Mashable article about the AWS outage: “The outage to Instagram and other major sites shows that — despite massive hype and momentum in the Internet world — cloud computing isn’t necessarily a magic solution for businesses’ data and IT needs.”

Will this AWS outage, and the other highly publicized outages, lead to decreased adoption of cloud computing? Some of the experts say yes. Others say no, but they concede that more companies will be pushed towards private cloud.

“Does the need for better disaster recovery (DR) destroy the cloud value proposition?” All of the roundtable participants agreed that the answer is no. Smarter disaster recovery solutions need to be in place to ensure that outages, which are inevitable, are mere hiccups and don’t disable services for days. What kind of DR services are you getting from your cloud service provider? Ask these four questions to make sure your DR solution is the industry’s best. When you get DR right, cloud can seem pretty magical.

One comment on “Amazon Web Services Outage and Cloud Disaster Recovery”

  1. Reply

    While we’re all excited about cloud, I think this type of discussion is one that will inevitably come up when folks are considering ‘internal vs external’ replication options. Challenges for recovery are still there, for crash consistent multi-tiered application recovery with write order fidelity, etc. Zerto can provide a ‘smarter’ solution for folks.

    There’s a major hole in recoverability options from these providers. My last company used Amazon for their SaaS offering. It went down for 5000+ customers twice last year alone. Support can’t handle that many tickets, and cancellations happen due to these outages.
