Cloudflare Sets September 15 Default Block on Mixed-Use AI Crawlers

Over 50% of AI crawler traffic is wasted re-fetching unchanged pages. Starting September 15, 2026, Cloudflare's default settings will block any crawler that mixes search, agent use, and training from accessing ad-supported pages unless the site owner explicitly opts in.

Why Mixed-Use Crawlers Are Getting Cut Off

Cloudflare is drawing a hard line between legitimate search indexing and AI data harvesting. The company's new default applies to all new Cloudflare customers, new sites added by existing customers, and all existing free-tier customers. Site owners can adjust settings to allow specific crawlers, but mixed-use bots will be blocked on ad-supported pages until the owner opts in.
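The default described above can be read as a small decision rule. The sketch below is illustrative only and is not Cloudflare's implementation: the `Crawler` type, the purpose labels, the `allowed_by_default` function, and the reading of "mixed-use" as "declares more than one purpose" are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical labels for the three uses named in the article.
SEARCH, AGENT, TRAINING = "search", "agent", "training"


@dataclass(frozen=True)
class Crawler:
    """Hypothetical record of a crawler and the purposes it declares."""
    name: str
    purposes: frozenset


def is_mixed_use(crawler: Crawler) -> bool:
    # Assumption: "mixed-use" means more than one declared purpose.
    return len(crawler.purposes) > 1


def allowed_by_default(crawler: Crawler, page_has_ads: bool,
                       opted_in: frozenset = frozenset()) -> bool:
    """Return True if the crawler may fetch the page under the sketched default."""
    if crawler.name in opted_in:
        return True  # the site owner explicitly allowed this crawler
    if page_has_ads and is_mixed_use(crawler):
        return False  # default block on ad-supported pages
    return True


search_only = Crawler("search-bot", frozenset({SEARCH}))
mixed = Crawler("mixed-bot", frozenset({SEARCH, AGENT, TRAINING}))

print(allowed_by_default(search_only, page_has_ads=True))
print(allowed_by_default(mixed, page_has_ads=True))
print(allowed_by_default(mixed, page_has_ads=True,
                         opted_in=frozenset({"mixed-bot"})))
```

Under these assumptions, a single-purpose crawler passes, a mixed-use crawler is blocked on ad-supported pages, and an owner opt-in restores access.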

CEO Matthew Prince called out the "world's largest search engine," Google, for making it nearly impossible for publishers to stay discoverable without also feeding AI training pipelines. Prince claims Google's crawler can access "2x more information" than other AI companies because Google-Extended requires a separate opt-out, and site owners fear that blocking Googlebot entirely would kill their search traffic. Google counters that its Google-Extended token lets publishers opt out of AI use without affecting Search inclusion, but Cloudflare argues the two are still tangled in practice.
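For readers unfamiliar with the opt-out Google describes, it is expressed in a site's robots.txt file. The sketch below uses Python's standard `urllib.robotparser` to check a sample file that disallows the `Google-Extended` token while leaving other crawlers, including Googlebot, allowed. The file contents and the `example.com` URL are assumptions for illustration, and Python's parser only approximates how any real crawler interprets the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/article"  # placeholder URL

search_ok = parser.can_fetch("Googlebot", url)
ai_ok = parser.can_fetch("Google-Extended", url)
print(f"Googlebot allowed: {search_ok}")
print(f"Google-Extended allowed: {ai_ok}")
```

This shows only the mechanism Google points to. It does not settle Cloudflare's argument that search and AI use remain entangled in practice.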

Cloudflare's Pay Per Use Model and Partners

Alongside the default block, Cloudflare is expanding its Pay Per Crawl marketplace into a broader "Pay Per Use" system. Early partners are Ceramic.ai and You.com. When a publisher opts in, they get paid every time their content appears in Ceramic's AI search results or when You.com accesses premium content. Cloudflare says this model lets publishers monetize AI usage based on value generated, not just bytes fetched.

The timing isn't accidental. A recent Cloudflare report showed that non-human traffic now exceeds human traffic online for the first time, a milestone not expected until 2027. Prince argues that changing the defaults quickly is necessary to build a sustainable ecosystem where publishers aren't giving away IP for free.

What This Means for Google's AI Dominance

Google has long argued that its crawling practices are transparent, but Cloudflare's policy puts pressure on the search giant to separate its search crawler from its AI crawler more cleanly. If Google wants to continue indexing ad-supported sites under Cloudflare's new defaults, it will need to ensure Googlebot is classified as a single-purpose search crawler, or convince publishers to explicitly opt their sites back in for mixed AI use.

Publishers who keep the default block can still let in clean crawlers like Googlebot (if configured correctly) or any bot that declares a single purpose. But the era of unrestricted AI data scraping is ending for a large share of the web. Cloudflare's middleman role gives it leverage to enforce this separation at scale, and early partners Ceramic.ai and You.com suggest that a paid-content-for-AI model is already running in production.

For large models, the open web's data supply now comes with a tollbooth.

Source: Cloudflare's new policy pushes AI companies to pay for publishers' content
Domain: techcrunch.com
