In case you haven’t seen the news – or heard from friends and family complaining on Facebook – Delta passengers experienced a second day of flight delays and cancellations today due to a power outage in Atlanta yesterday.
Hundreds of flights were cancelled, and the problems were compounded by Delta’s inability to get its notification systems up and running – many passengers were not notified about their cancelled and delayed flights. According to yesterday’s Wall Street Journal report on the Delta outage:
“An electric problem at its Atlanta headquarters occurred at 2:30 a.m. ET and the airline was forced to hold hundreds of departing planes on the ground starting at 5 a.m., according to Ed Bastian, the chief executive, who apologized to customers on a video…”
And the kicker:
“The technical problems likely will cost Delta millions of dollars in lost revenue and damage its hard-won reputation as the most reliable of the major U.S.-based international carriers, having canceled just a handful of flights in the most recent quarter.”
Massive revenue loss and damage to the brand in just two days. The question everyone is asking is – how is this possible? How can companies not have the basic business continuity and disaster recovery systems in place to handle a power outage? This seems like IT 101. The same article suggests part of the answer:
“The meltdown highlights the vulnerability in Delta’s computer system, and raises questions about whether a recent wave of four U.S. airline mergers that created four large carriers controlling 85% of domestic capacity has built companies too large and too reliant on IT systems that date from the 1990s. Delta merged with Northwest Airlines eight years ago.
These systems—which run everything from flight dispatching to crew scheduling, passenger check-in, airport-departure information displays, ticket sales and frequent-flier programs—gradually have been updated but are still vulnerable, IT experts said.”
And don’t get us wrong – we are not here to gloat. We have seen this complexity firsthand with many of our customers: legacy systems, a lack of hardware interoperability, and new software versions running alongside older ones. The bigger the company, the greater the complexity. Short of revamping everything, which would be prohibitively expensive, there are a few simple steps companies can take to make sure their BC/DR systems will work when needed.
Our take: companies need to stop thinking about disaster recovery as an insurance policy and shift their thinking to always-available IT, something we refer to as ‘IT Resilience’. Organizations that embrace resilience, a more proactive approach to BC/DR, focus on continuous availability rather than recovery after the fact. Their confidence in the resilience of their infrastructure comes from the knowledge that their business is “ready for anything” and that they can scale effectively, incorporate new technology easily, and above all, avoid disasters.