And so this is how a tiny Cloudflare update broke huge chunks of the internet

Daniel Sims · Nov 19, 2025

Ripple effect: When Cloudflare experienced a massive outage on Tuesday, many people, including the company's engineers, initially suspected a sophisticated DDoS attack. The company later explained that a flawed change to its server infrastructure caused a single file to grow past a built-in size limit, which made the software that reads it fail. Several major outages in recent years have resulted from similar single points of failure.

Cloudflare CEO Matthew Prince has published a detailed apology and explanation of the incident, which disrupted many popular online platforms. Uber, ChatGPT, McDonald's, League of Legends, X, the New Jersey Transit system, and even TechSpot experienced service interruptions for hours.

Because Cloudflare protects these and other sites from DDoS attacks and other threats, the company first assumed it was facing a major security incident when servers began failing at around 6:20 a.m. ET on Tuesday. Another reason for the initial assumption was that the failures came and went for nearly two hours before becoming continuous around 8:00 a.m.

However, the company eventually discovered that it had changed a permission in a database system under a mistaken assumption about the system's behavior, and that the change doubled the size of a file critical to Cloudflare's bot manager. This manager, which directs automated traffic through the company's systems, updates continuously in response to ever-evolving threats but also enforces file size limits to minimize memory consumption and ensure smooth performance.

When the bot manager loaded the inflated file, which exceeded those limits, it failed and began returning errors. The glitches were initially intermittent because the faulty file took time to propagate throughout the entire system. Cloudflare resolved the issue by reverting to an earlier version of the file at 9:30 a.m. ET and had restored all operations by noon.
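The failure mode described above, a continuously updated file hitting a hard size limit, can be illustrated with a small sketch. This is not Cloudflare's code: the class name `BotConfigLoader`, the `max_features` cap, and the feature names are all hypothetical, chosen only to show one common defensive pattern, which is to reject an oversized update and keep serving the last version that passed validation instead of failing outright.

```python
from dataclasses import dataclass, field


@dataclass
class BotConfigLoader:
    """Hypothetical loader that keeps the last update that passed validation."""

    max_features: int = 200  # assumed cap; the article only says a limit exists
    active: list[str] = field(default_factory=list)
    rejected_updates: int = 0

    def apply_update(self, features: list[str]) -> bool:
        """Adopt `features` if within the limit; otherwise keep the current set."""
        if len(features) > self.max_features:
            # Oversized update: count it and keep the last known-good config.
            self.rejected_updates += 1
            return False
        self.active = list(features)
        return True


def demo() -> tuple[bool, bool, int]:
    """Apply a valid update, then a doubled one, and report what happened."""
    loader = BotConfigLoader(max_features=4)
    good = ["f1", "f2", "f3"]
    doubled = good + good  # duplicated entries double the file's size
    first = loader.apply_update(good)
    second = loader.apply_update(doubled)
    return first, second, len(loader.active)
```

In this sketch the doubled update is refused and the three features from the earlier update stay active, so an inflated file degrades freshness rather than availability. Whether that trade-off is acceptable depends on how stale the configuration is allowed to become.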

This chart shows the volume of 5xx errors served by the Cloudflare network. Normally this should be very low, but the peaks show when the outage first appeared and then took full hold.

Prince described the incident as the company's worst since a major outage in 2019 and promised that Cloudflare would review the affected systems and return stronger. However, the event is only the latest example of a small mistake causing a major outage.

In October, a glitch in a single database service caused a major Amazon Web Services outage that took ChatGPT, Fortnite, Reddit, Amazon, and other popular services offline. One of the most serious incidents of this kind occurred in July 2024, when a faulty CrowdStrike security update triggered the infamous Blue Screen of Death on critical Windows systems worldwide. The outage affected broadcasters, transportation services, and numerous other businesses.


Thatsdisgusting · Nov 19, 2025

Vibe updates with vibe code made you vibe from internet for a day

scoffer · Nov 19, 2025

And wasn't the much-vaunted Mossgrow 365 out as well? LOL.........Cloud garbage. No thanks.

Xelions · Nov 20, 2025

More like..

We apologize for our gross incompetence and negligence, and had we invested/retained the necessary staff with the expertise actually required our customers would not have experienced the bullsh#$ that occurred that was of our own doing. We'd also like to admit that the greed shared amongst our CEO, executives and high level employees, and the current way capitalism is - that we don't mind breaking our product/services to meet high returns for our shareholders. In other words, fu#$ you!

Saying it like it is!

desperado81 · Nov 20, 2025

So where's the redundancy? Shouldn't these big companies have some failover in case of failure, an emergency backup like for electricity? Should be required by law for large companies.

Plutoisaplanet · Nov 20, 2025

This guy definitely wasn’t to blame:

https://twitter.com/x/status/1990842843481387410

FaTaL · Nov 20, 2025

Training exercise right?

gamerk2 · Nov 21, 2025

desperado81 said:
So where's the redundancy? Shouldn't these big companies have some failover in case of failure, an emergency backup like for electricity? Should be required by law for large companies.

They kind of did; once they understood the problem they fixed it by reverting a previous change. All in all, a three hour downtime for a major configuration snafu is pretty short, all things considered.
