Why the Cloudflare outage disrupted major sites on Nov 18

The Cloudflare outage on Nov 18 disrupted major services globally; the postmortem blames a malformed Bot Management file and confirms full restoration by 17:06 UTC.

Deden Sembada · 19 Nov 2025

Cloudflare said its network suffered a major outage on November 18, 2025, when a malformed Bot Management feature file exceeded internal software limits, disrupting traffic for millions of websites worldwide and affecting services such as ChatGPT, Spotify, and YouTube until engineers reverted the file to restore service. The 'cloudflare down' incident was not an attack: it resulted from a permissions change that produced an oversized feature list in the company's bot controls, and Cloudflare said services were fully restored by 17:06 UTC after a rollback. Experts linked the episode to dependency risks among major network providers, noting its worldwide impact.

What happened and why

Investigators traced the outage to a recent database permission change that allowed an unusually large feature file to be created inside Cloudflare's Bot Management system, which then exceeded internal software limits and caused widespread traffic disruption.

Cloudflare emphasized the root cause was operational change rather than a cyberattack, and engineers mitigated the issue by reverting the feature file; the company reported full restoration by 17:06 UTC on November 18.
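The postmortem describes the mechanism only at a high level: a generated file grew past an internal limit and the consumer failed hard rather than falling back. The sketch below is illustrative Python, not Cloudflare's code; the cap, the file format, and every function name are assumptions.

```python
# Hypothetical sketch (not Cloudflare's actual code): how a hard cap on the
# number of entries in a generated feature file can turn an oversized file
# into a hard failure instead of a graceful fallback.

MAX_FEATURES = 200  # assumed internal limit on bot-scoring features


class FeatureFileError(Exception):
    """Raised when a generated feature file violates the loader's limits."""


def load_feature_file(lines: list[str]) -> list[str]:
    """Parse a newline-delimited feature file, enforcing the hard cap."""
    features = [line.strip() for line in lines if line.strip()]
    if len(features) > MAX_FEATURES:
        # An oversized file generated after the permission change would trip
        # this check and take the module down with it.
        raise FeatureFileError(
            f"{len(features)} features exceeds the limit of {MAX_FEATURES}"
        )
    return features


def load_with_fallback(lines: list[str], last_good: list[str]) -> list[str]:
    """Safer pattern: keep serving the last known-good file instead of failing."""
    try:
        return load_feature_file(lines)
    except FeatureFileError:
        return last_good
```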

  • Scale: impacted about one in five websites globally
  • Duration: roughly three hours for major effects
  • Cause: oversized Bot Management feature file after permission change

Service | Impact | Notes
ChatGPT | Unreachable | AI chatbot access dropped for many users
Spotify | Streaming interruptions | Playback and login errors reported
YouTube | Partial outage | Embedded videos and uploads affected
Bet365 | Site offline | Gaming and betting services disrupted
League of Legends | Service lag | Matchmaking and web portals unusable

The postmortem immediately ruled out an external attack as the trigger.

Scope and services affected

Because Cloudflare routes traffic and provides CDN, DNS, and security services for millions of domains, the outage had outsized reach: estimates showed roughly 20% of websites relied on Cloudflare routing at the time, and several high-traffic platforms including gaming networks, streaming services, and news sites experienced partial or total downtime.

Public monitoring tools and status pages showed spikes in errors; Downdetector reports surged, and major customers such as ChatGPT and Spotify reported intermittent failures, illustrating how an outage at a single infrastructure provider can cascade across entire ecosystems.

  • Content platforms: video and audio services suffered playback issues
  • Applications: API-driven apps reported timeouts and auth failures
  • E-commerce: checkout flows and DNS resolution errors impeded sales
  • Gaming: matchmaking and login services showed latency or disconnects
  • News and publishing: site availability and comment systems went down

Analysts warned that concentration of cloud services raises systemic risk; teams should design multi-provider fallbacks and resilient DNS strategies.
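To make that advice concrete, here is a minimal sketch of provider ordering: check a list of endpoints and use the first one that answers its health check. The hostnames are placeholders, and a production setup would usually express the same logic as health-checked DNS failover records or a load-balancer pool rather than application code.

```python
# Minimal multi-provider fallback sketch; both endpoints are hypothetical.
import urllib.request

PROVIDERS = [
    "https://cdn-primary.example.com/health",
    "https://cdn-backup.example.net/health",
]


def first_healthy(endpoints: list[str], timeout: float = 2.0) -> str | None:
    """Return the first endpoint that answers its health check, else None."""
    for url in endpoints:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return url
        except OSError:
            continue  # provider unreachable or erroring; try the next one
    return None


if __name__ == "__main__":
    print(first_healthy(PROVIDERS) or "no healthy provider")
```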

Timeline and response

Detection and triage began quickly after error spikes appeared in monitoring systems; engineers identified abnormal Bot Management behavior, traced it to a permissions change, and executed a rollback that progressively restored connectivity across regions.

Time | Event
13:00 UTC | Errors spike in global monitoring
13:20 UTC | Investigators link issue to Bot Management file
14:05 UTC | Rollback initiated for feature file
15:30 UTC | Partial recovery in some regions
17:06 UTC | Full service restoration announced

Cloudflare's status updates and postmortem emphasized there was no evidence of malicious traffic or DNS compromise, and that changes to database permissions inadvertently allowed an oversized feature file to propagate; internal safeguards were adjusted and additional validation checks deployed to prevent recurrence.
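Cloudflare has not published the exact checks it added, so the following is only a generic sketch of a pre-propagation validation gate: reject a generated artifact that is too large, has too many entries, or contains duplicates before it is pushed fleet-wide. The limits and helper name are assumptions.

```python
# Generic pre-deploy validation gate for a generated configuration artifact.
# All limits are illustrative, not Cloudflare's real thresholds.

MAX_BYTES = 1_000_000   # assumed cap on artifact size
MAX_ENTRIES = 200       # assumed cap on feature entries


def validate_artifact(payload: bytes) -> list[str]:
    """Return a list of violations; an empty list means safe to propagate."""
    problems: list[str] = []
    if len(payload) > MAX_BYTES:
        problems.append(f"artifact is {len(payload)} bytes (max {MAX_BYTES})")

    entries = [line for line in payload.decode().splitlines() if line.strip()]
    if len(entries) > MAX_ENTRIES:
        problems.append(f"{len(entries)} entries (max {MAX_ENTRIES})")
    if len(set(entries)) != len(entries):
        problems.append("duplicate entries detected")
    return problems
```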

  • Rolled back offending file and monitored traffic recovery
  • Validated configuration and permission controls across clusters
  • Updated incident response runbooks and added automated checks

External partners were notified and advised to reset DNS caches to reduce residual impact.

Lessons and next steps

The incident prompted rapid reactions from SRE teams and platform owners who prioritized incident review, customer communication, and contingency planning, with many citing the Cloudflare postmortem as a reminder that even sophisticated providers can suffer operational failures.

Security and infrastructure leads recommended multi-DNS setups, redundant CDNs, health-checking, and automated failover. Developers were urged to build graceful degradation for API calls and to prepare clear communication templates for real-time service degradation.
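A minimal sketch of that graceful-degradation advice, assuming a hypothetical JSON API and an in-process cache of the last good response: the call gets a short timeout, and a network failure serves stale-but-usable data instead of an error page.

```python
# Graceful degradation for an outbound API call: short timeout plus a cached
# last-good response. The endpoint and cache shape are hypothetical.
import json
import urllib.request

_cache: dict[str, dict] = {}   # last successful response per URL


def fetch_with_fallback(url: str, timeout: float = 2.0) -> dict:
    """Try the live API briefly; on failure, serve the cached copy marked stale."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = json.load(resp)
            _cache[url] = data                     # refresh the fallback copy
            return data
    except (OSError, ValueError):                  # unreachable provider or bad payload
        if url in _cache:
            return {**_cache[url], "stale": True}  # degraded but usable
        raise                                      # nothing cached: surface the error
```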

  • Design multi-provider topology
  • Implement DNS TTL and cache management
  • Automate rollback and configuration audits (see the sketch below)
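The rollback item can be as simple as refusing to let a change outlive an error spike. This sketch assumes hypothetical deploy() and error_rate() helpers supplied by the caller; the threshold and observation window are illustrative.

```python
# Automated rollback loop: deploy, watch an error metric, revert on a spike.
import time

ERROR_THRESHOLD = 0.05    # assumed: revert above a 5% error rate
OBSERVATION_SECS = 300    # watch the new config for five minutes
CHECK_INTERVAL = 30


def deploy_with_rollback(new_version: str, last_good: str, deploy, error_rate) -> str:
    """Deploy new_version; revert to last_good if errors spike. Returns the live version."""
    deploy(new_version)
    deadline = time.monotonic() + OBSERVATION_SECS
    while time.monotonic() < deadline:
        if error_rate() > ERROR_THRESHOLD:
            deploy(last_good)         # automatic revert, no human in the loop
            return last_good
        time.sleep(CHECK_INTERVAL)
    return new_version                # change held steady; keep it
```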

One analyst noted that the 'cloudflare down' event exposes systemic concentration risk and urged enterprises to run regular chaos tests targeting third-party dependencies and to budget for provider redundancy.
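In practice, a chaos test against a third-party dependency can be as lightweight as a unit test that forces the provider client to fail and asserts the application degrades rather than crashes. Everything named below (fetch_profile, call_provider) is hypothetical application code, not any vendor's SDK.

```python
# Chaos-style test: simulate the provider being down and require degradation.
import unittest
from unittest.mock import patch


def call_provider(user_id: str) -> dict:          # stand-in for a real client
    return {"user_id": user_id, "source": "provider", "degraded": False}


def fetch_profile(user_id: str) -> dict:
    """App code under test: call the provider, degrade to a stub on failure."""
    try:
        return call_provider(user_id)
    except ConnectionError:
        return {"user_id": user_id, "source": "cache", "degraded": True}


class ThirdPartyOutageTest(unittest.TestCase):
    def test_degrades_when_provider_is_down(self):
        # Chaos injection: every provider call fails as if the edge were unreachable.
        with patch(f"{__name__}.call_provider", side_effect=ConnectionError):
            result = fetch_profile("u123")
        self.assertTrue(result["degraded"])        # app stays up, just degraded


if __name__ == "__main__":
    unittest.main()
```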

Boards and CTOs will likely press vendors for tighter safeguards, more transparent change controls, and proof of isolation testing; observability data and incident drills will become standard parts of vendor evaluation over the next year.

Services were restored, and Cloudflare published a detailed postmortem attributing the outage to an operational permission change that created an oversized Bot Management feature file, not to a cyberattack. Teams across the internet will now focus on resilience measures, including multi-provider DNS and CDN strategies, stricter configuration governance, and routine chaos engineering to simulate third-party failures and validate rollback paths. For operators tracking 'cloudflare down' risks, actionable steps include reducing blast radius, lowering DNS TTLs, automating failover, and documenting communication plans for customers and stakeholders. Expect vendor Q&A sessions and regulatory scrutiny in the coming months.