Amazon Web Services (AWS) has disclosed that a severe outage on October 20, 2025, which paralyzed thousands of apps and websites globally, was triggered by an automated system bug. The malfunction originated within AWS’s automated DNS management system for its Amazon DynamoDB service in the US-East-1 data-centre region and cascaded into widespread disruption.

AWS said the issue began when an empty DNS record was created in the US-East-1 region and the automation responsible for detecting and correcting such anomalies failed to act. With the automation stalled on that single faulty record, AWS disabled the system globally, and engineers had to intervene manually to restore service while they worked on a fix.
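The failure mode described here, an empty record going live unchecked, can be illustrated with a simple guard. The following is a minimal sketch, not AWS's actual system: the names `DnsRecordSet`, `is_safe_to_apply` and `apply_plan` are hypothetical, and the only rule it enforces is that an update leaving a name with no addresses is rejected in favour of the last known good record.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DnsRecordSet:
    """Hypothetical model of the addresses published for one DNS name."""
    name: str
    addresses: Tuple[str, ...]


def is_safe_to_apply(record_set: DnsRecordSet) -> bool:
    """Return False for a record set that would resolve to nothing."""
    return len(record_set.addresses) > 0


def apply_plan(current: DnsRecordSet, proposed: DnsRecordSet) -> DnsRecordSet:
    """Return the record set that should be live.

    An empty proposal is refused and the current record is kept,
    so a faulty update cannot erase a working endpoint.
    """
    if not is_safe_to_apply(proposed):
        return current
    return proposed
```

In this sketch, an update carrying no addresses leaves the existing record untouched, while a valid update replaces it as normal.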

The fallout was vast. Apps and platforms including Snapchat, Signal, Fortnite and numerous financial, gaming and smart-device services reported outages. Even internet-connected mattresses from a smart-bed company failed to operate correctly as their cloud connectivity went dark, causing users to wake up to malfunctioning beds.

When the issue began in the US-East-1 region, AWS noted that an underlying “health-monitoring subsystem” associated with its internal network load balancers was the trigger, but the wider root cause came down to automation code. Because DynamoDB is foundational to many AWS services, the DNS error in a single region set off a domino effect across services, putting further strain on the cloud ecosystem.

The incident once again raises questions about how concentrated cloud infrastructure has become and how dependent many businesses are on a handful of platforms. Experts pointed out that outages at such a major cloud provider can ripple through nearly every part of the digital economy: banking systems, mobile services, IoT devices, streaming, commerce and industrial controls.

AWS has promised further improvements, including strengthening the automation safeguards, increasing testing of the DNS and database control planes, and offering more guidance to customers on building resilient architectures that do not rely solely on one region or automation system.
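For customers, not relying solely on one region can be as simple in principle as trying a second region when the first fails. The following is a minimal sketch under stated assumptions: `call_with_region_failover` and `AllRegionsFailed` are hypothetical names, and `request` stands in for whatever client call an application makes against a given region. Real deployments also need data replicated across regions, which this sketch does not address.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class AllRegionsFailed(Exception):
    """Raised when every region in the list has been tried without success."""


def call_with_region_failover(
    regions: Sequence[str],
    request: Callable[[str], T],
) -> T:
    """Try the request against each region in order; return the first success."""
    errors: List[str] = []
    for region in regions:
        try:
            return request(region)
        except Exception as exc:  # any failure moves on to the next region
            errors.append(f"{region}: {exc}")
    raise AllRegionsFailed("; ".join(errors))
```

An application would pass its preferred region first and one or more fallbacks after it, for example `["us-east-1", "us-west-2"]`.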