4 Ways to Prepare for the Next Apocalyptic Amazon Web Services Outage
Remember February? Data feeds of hotel prices, user activity, photographs, favorited sites and product offers delivered a spinning wheel of death.
You probably heard about the outage in late February that hit Amazon Web Services (AWS), the company’s cloud-computing service. Amazon is the country’s largest service provider of cloud computing for businesses, and thousands of websites and apps that rely on its services found themselves in freak-out mode due to the interruption.
What happened was this: Amazon’s Simple Storage Service (S3) went down, and Amazon clients instantly lost access to all kinds of things they’d stored remotely on the company’s servers -- everything from images to customer transaction data.
The cause of the outage was eventually identified as human error, and the chaos was compounded by the fact that other AWS services rely on S3 for storage. And the danger is not over: Despite Amazon's reputation for reliability, an outage like this one will inevitably happen again.
That's why small businesses must learn from past mistakes and be ready for the next outage that comes along.
The scars remain
Because so many businesses rely on S3 -- more than 148,000 websites, according to TechCrunch -- the outage had far-reaching ramifications, from small inconveniences to the complete erosion of business models and revenue drivers.
Popular image- and link-sharing site Pinterest uses AWS, which meant that during the outage, everyone from soon-to-be brides keeping track of their favorite wedding decorations to 18-year-olds recording their burgeoning tattoo ideas was unable to build their libraries. While this may not seem terribly consequential, it’s evidence of how an outage can directly affect the lives of millions of users.
Any company that, say, chose to build its entire infrastructure on Amazon undoubtedly experienced a little bit of panic when its logins ceased to work. Data feeds of hotel prices, user activity, photographs, favorited sites and product offers delivered only a spinning wheel of death for those who attempted to load them. Few were spared -- not Business Insider or Giphy, or countless IoT-connected thermostats and lightbulbs.
Hope for the future
Luckily, Amazon was able to get its systems back on track in relatively short order. But its previous image as an immovable, unbreakable pillar in the hosting community no doubt suffered. Moreover, the incident got the attention of quite a few developers and helped them realize that having a backup plan is probably a decent strategy.
If you’re one of these developers or the leader of a small business, pay attention to the tips below while formulating your plan.
1. Divide the duties.
Make sure your hosted scripts aren’t solely reliant on one service. Handing your entire operation over to an outside service -- such as Microsoft or Amazon -- will cut you off from the necessary controls that allow you to address threats or outages when they occur.
If your budget allows, set up more than one hosting service in case the primary source falls through. Simple checks can ensure that if the original source isn’t available, a fallback to another host -- or even a locally hosted source -- can keep your site running.
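As a rough illustration of such a check, here is a minimal Python sketch. It assumes you keep an ordered list of places your script or asset is hosted, and it returns the first one that responds. The function names and the example URLs are hypothetical, not part of any particular hosting product.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers an HTTP request without an error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers DNS failures, refused connections, timeouts, HTTP error
        # statuses and malformed URLs.
        return False


def first_available(
    sources: Iterable[str],
    check: Callable[[str], bool] = is_reachable,
) -> Optional[str]:
    """Return the first source that passes the check, or None if all fail."""
    for source in sources:
        if check(source):
            return source
    return None


# Hypothetical usage: try the primary host first, then a second host.
# chosen = first_available([
#     "https://primary.example.com/app.js",
#     "https://backup.example.com/app.js",
# ])
```

If every source fails, the function returns None, and that is the moment to fall back to a locally hosted copy.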
2. Rack up redundancies.
If your entire project really is reliant on AWS capabilities, the alternative is to build up a secondary source -- hosted elsewhere -- to operate in a redundant fashion. This way, when one service goes down, the product can keep moving along, with little to no interruption.
Of course, this is monumentally difficult. While Microsoft Azure and Google offer comparably robust packages for scaling, they often operate in significantly different ways than AWS does. Many of the items that come prebuilt with AWS simply won’t be available on Google or other services. This means you’ll either need to invest in extensive custom development to create similar functionality or set a preplanned goal to limit your system’s reliance on AWS-specific capabilities.
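One way to limit reliance on provider-specific capabilities is to have your application talk to a small storage interface of your own, rather than to any one vendor's API directly. The Python sketch below is only an illustration under that assumption: the class names are hypothetical, and the in-memory store stands in for real adapters you would write around each provider.

```python
from typing import Dict, List, Protocol


class BlobStore(Protocol):
    """The only storage operations the application is allowed to use."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a real provider adapter; can simulate an outage."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}
        self.available = True

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise ConnectionError("store unavailable")
        self._items[key] = data

    def get(self, key: str) -> bytes:
        if not self.available:
            raise ConnectionError("store unavailable")
        return self._items[key]


class RedundantStore:
    """Writes to every store and reads from the first one that answers."""

    def __init__(self, stores: List[BlobStore]) -> None:
        self._stores = stores

    def put(self, key: str, data: bytes) -> None:
        successes = 0
        for store in self._stores:
            try:
                store.put(key, data)
                successes += 1
            except ConnectionError:
                continue
        if successes == 0:
            raise ConnectionError("no store accepted the write")

    def get(self, key: str) -> bytes:
        for store in self._stores:
            try:
                return store.get(key)
            except (ConnectionError, KeyError):
                continue
        raise LookupError(f"no store could return {key!r}")
```

A production version would also need to re-sync a store that missed writes while it was down, which this sketch leaves out.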
3. Fancify your failures.
Even with the strongest precautions in place, the occasional outage is inevitable. In this event, make sure your failure points are graceful. When a user gets an ugly AWS message that’s practically indecipherable, it creates a feeling of panic and anxiety. Instead, plan for these types of situations with useful, friendly and self-effacing messages that actually give the user an idea of what’s happening.
For example, if my website at Rocksauce Studios goes down, users aren’t faced with a vomit of incomprehensible code. Instead, a friendly message pops up saying, “Oops!” before offering users the ability to either try again or contact our support team for help. Not only does this show our awareness of the problem, but it also offers an avenue for users to reach out for a fix.
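In code, a graceful failure point can be as simple as catching the error where content is loaded and returning a prepared message instead. The sketch below is a hypothetical Python example, not the actual Rocksauce Studios implementation; the message wording and function names are assumptions made for illustration.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# Hypothetical wording; adapt it to your own brand voice and support channels.
FRIENDLY_MESSAGE = (
    "Oops! Something went wrong on our end. "
    "Please try again, or contact our support team for help."
)


def render_page(load_content: Callable[[], str]) -> str:
    """Return the page content, or a friendly message if loading fails."""
    try:
        return load_content()
    except Exception:
        # Keep the technical details in the logs, not in front of the user.
        logger.exception("Content failed to load")
        return FRIENDLY_MESSAGE
```

The user sees a clear next step, while the full error still reaches your logs for debugging.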
4. Give the gift of gab.
Be ready to communicate with your users. This is perhaps the most important factor, because a completely independent backup system -- whether you’re a company such as Netflix or a small bootstrapping startup -- is an extreme solution, given the presumably rare nature of the problem. Instead, having a well-formed response ready to send to users via email, Twitter, Facebook or other means can provide you with the necessary grace.
Responding promptly to customer inquiries, even when you don’t have the answer, goes a long way toward showing that you care about your customers and are aware of the problem.
The reliability of cloud hosting services is measured in “uptime,” or the amount of time these services are working and available. Most companies do everything possible to keep their uptime as close to 100 percent as they can, meaning they're not likely to experience regular outages any time soon. That being said, taking a few simple precautionary measures can go a long way toward easing the pains of an outage should one happen to hit your hosting services in the future.
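To put uptime figures in perspective, it helps to convert a percentage into the hours of downtime it allows. The short Python sketch below does that arithmetic, assuming a 365-day year; the function name is hypothetical, and the percentages are examples rather than any provider's stated guarantee.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into hours of downtime per year."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


for percent in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(percent)
    print(f"{percent}% uptime allows about {hours:.2f} hours of downtime a year")
```

Even 99.9 percent uptime leaves room for roughly 8.76 hours of outage a year, which is why a fallback plan is worth the effort.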