Cloudflare tests "pay-per-crawl" system to charge AI firms for scraping website content

midian182

What just happened? Cloudflare is experimenting with a new way to stop AI crawlers from scraping website content. The CDN and security company has announced that it will, by default, block crawlers that access content without permission or compensation. Publishers can still allow the crawlers, but the AI firms operating them will be charged.

Starting today, the owner of every new website that signs up to Cloudflare will be asked whether they want to allow AI crawlers to scrape their site. Site owners can choose not only whether to allow access and to which content, but also how AI companies can use it.

Moreover, AI companies can clearly state whether their crawlers are being used for training, inference, or search, helping owners decide which crawlers to allow.

Must read: The Zero Click Internet

Cloudflare launched a free tool to block AI bots in 2024, but this change allows publishers to block them by default, and without altering any settings. Condé Nast, TIME and The Associated Press are just some of the publishers who have signed up to block the crawlers. Cloudflare says over 1 million customers have chosen this option.

Cloudflare adds that a small number of publishers and content creators are participating in a private beta for its pay-per-crawl feature, which lets publishers who permit scraping set a price for access to their content.

"Each time an AI crawler requests content, they either present payment intent via request headers for successful access (HTTP response code 200), or receive a 402 Payment Required response with pricing," Cloudflare explained.
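The exchange Cloudflare describes can be sketched in a few lines of Python. This is only an illustration of the 200-or-402 decision, not Cloudflare's implementation: the header names `x-example-max-price` and `x-example-price`, the `decide` function, and the idea that the crawler states a maximum price it is willing to pay are all assumptions made for the example.

```python
from decimal import Decimal, InvalidOperation
from http import HTTPStatus

# Hypothetical header names, chosen for illustration only.
# They are NOT taken from Cloudflare's documentation.
MAX_PRICE_HEADER = "x-example-max-price"  # sent by the crawler
PRICE_HEADER = "x-example-price"          # returned with a 402


def decide(request_headers: dict[str, str], price: Decimal) -> tuple[int, dict[str, str]]:
    """Return (status code, response headers) for one crawler request.

    Assumes header names in request_headers are already lowercased.
    """
    offered_raw = request_headers.get(MAX_PRICE_HEADER)
    try:
        offered = Decimal(offered_raw) if offered_raw is not None else None
        if offered is not None and not offered.is_finite():
            offered = None  # reject NaN or Infinity offers
    except InvalidOperation:
        offered = None  # treat an unparsable offer as no offer

    if offered is not None and offered >= price:
        # Payment intent covers the publisher's price: serve the content.
        return HTTPStatus.OK.value, {}

    # No sufficient payment intent: respond 402 and quote the price.
    return HTTPStatus.PAYMENT_REQUIRED.value, {PRICE_HEADER: str(price)}
```

A crawler that receives the 402 could read the quoted price and decide whether to retry with a matching offer; how billing is actually settled is outside the scope of this sketch.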

Anyone interested in becoming part of the beta can sign up here.

Around 16% of global internet traffic goes directly through Cloudflare's CDN, according to a 2023 report, so the move could have a huge impact on AI companies.

"Original content is what makes the Internet one of the greatest inventions in the last century, and it's essential that creators continue making it," said Matthew Prince, CEO of Cloudflare.

"AI crawlers have been scraping content without limits. Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone."

For pay-per-crawl to work properly, AI companies must also sign up for the program. Cloudflare says that it has partnered with several AI firms willing to participate in what should be a mutually beneficial arrangement – assuming they agree to pay the prices set by publishers.

The news comes just a couple of weeks after Prince reiterated his previous warning that AI crawlers and summaries were destroying the internet's business model. Default blocking and pay-per-crawl are part of the company's plan to combat the threat of a zero-click internet, a term describing when users no longer need to click on links to find whatever content they want.

In the past, websites typically saw one human visitor for every six times Google crawled their pages, a relatively balanced ratio that often translated into ad views. By comparison, OpenAI's crawler sent far fewer visitors, about one per 250 crawls, while Anthropic's ratio was even steeper at roughly 6,000 crawls per visitor. According to Prince, those gaps have widened: Google now averages around 18 crawls per visitor, OpenAI's ratio has climbed to 1,500 to one, and Anthropic's is estimated at a staggering 60,000 to one.
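To put the widening in perspective, the crawls-per-visitor figures quoted above can be compared directly. The short calculation below uses only the numbers cited from Prince; the variable names are illustrative.

```python
# Crawls per human visitor, (earlier figure, current figure), as quoted above.
crawls_per_visitor = {
    "Google": (6, 18),
    "OpenAI": (250, 1_500),
    "Anthropic": (6_000, 60_000),
}

# How many times more crawling each visitor now "costs" a site.
growth = {name: now / before for name, (before, now) in crawls_per_visitor.items()}

for name, factor in growth.items():
    print(f"{name}: {factor:.0f}x more crawls per visitor")
# Google: 3x, OpenAI: 6x, Anthropic: 10x
```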

So they've been lawlessly scraping the whole net for all data to feed their LLMs for years, and now when they have them successfully launched, they simply gonna close the door for others behind.
 