Ripple Effects Felt Across the Internet With AWS Outage

Problems within the Amazon Web Services infrastructure caused large chunks of the Internet to either load slowly or not load at all starting 10:30 ET/15:30 GMT on Dec. 7, according to data from real-time outage monitoring service DownDetector. Amazon said the problems were in the US-EAST-1 region, which refers to Amazon’s data centers in Virginia, and affected Elastic Compute Cloud (EC2), Connect, DynamoDB, Glue, Athena, Timestream, Chime and other AWS services hosted in that region.

Network monitoring company ThousandEyes posted updates throughout the day. A screenshot from the ThousandEyes console showed that an API endpoint using the AWS API Gateway began to time out after 10 seconds. “Corresponding with the HTTP timeouts, we see greatly increased transaction times of between 20-30 seconds, as well as transaction timeouts,” the company noted.

“We also saw widespread impact to Amazon’s EC2 service across multiple regions, including in the U.S., Europe, and APJC, although the user impact varied depending on user IP address. Amazon’s S3 service also appeared to be impacted. Both of these services are dependencies for many non-Amazon apps and services, so collateral impacts may be broad,” ThousandEyes said.

“The root cause of this issue is an impairment of several network devices,” Amazon said in an update, noting that recovery was being impeded because the outage also affected Amazon’s own monitoring and incident response tools. AWS customers may be unable to log in using root login credentials, the company said, and it recommended “using IAM Users or Roles for authentication.”