Amazon S3 Outage Highlights Resilience Issues with Cloud Infrastructure

Amazon S3 suffered a significant outage on Wednesday in its US-East-1 region. The outage affected a number of companies in seemingly unpredictable ways. Yesterday, a DNS outage at GoDaddy had a similar effect on the availability of an otherwise seemingly unrelated set of Internet sites. We saw similar outages last year as a result of configuration problems at Level 3 and DDoS attacks from the Mirai botnet. All of these outages point to significant resilience issues that come with cloud and managed hosting services. These issues should be addressed as part of risk management planning, but as our recent study in Ashburn, VA highlighted, customers and data center and network providers often lack an adequate shared vocabulary for making these kinds of informed risk decisions.

