Cloudflare Announces New Content Scraping Protection Feature; “Easy Button” Stops AI Bots With a Click

By Scott Ikeda for CPO Magazine
Tuesday, July 8, 2025


The outcome of this particular suit could be the most consequential in setting precedent for emerging AI copyright law in the US. The AI companies generally defend content scraping by claiming that it falls within the “fair use” doctrine, and that the AI bots merely access public information in the way any average internet user might. But a successful fair use defense hinges on demonstrating a “transformative” quality and using only a limited amount of the original content, so the AI outfits may wind up in legal trouble if their models regularly regurgitate significant portions of articles without crediting the source.

Dr. Kolochenko, CEO at ImmuniWeb, believes that the “pay to scrape” model will greatly expand in the coming months and could prove to be a significant obstacle for the AI outfits: “This long-awaited feature by Cloudflare is a true disaster for many GenAI vendors, which may be fatal to the current business models of GenAI. Given that Cloudflare protects the majority of the world’s most popular websites, as well as millions of smaller websites that publish academic and scientific content, this security feature will elegantly prevent data-greedy bots from unwarrantedly scraping human-created content without permission and without paying for it. Ironically, the fierce legal battles currently taking place in courts on both sides of the Atlantic – disputing the alleged copyright infringements by numerous AI vendors – are mostly re-litigating arguments that are already lost. At the end of the day, these lawsuits will bring from little to no value to GenAI vendors: virtually all creative content providers are incrementally protecting their content with advanced anti-bot protection mechanisms, which Cloudflare has just made available to everybody in one click. Furthermore, content providers add specific contractual provisions to their terms of service that expressly prohibit any use of their data for LLM training purposes. In case of a violation of such terms of service, content providers will have a straightforward and time-tested legal claim for breach of contract, possibly accompanied with liquidated damages per violation, making such claims extremely lucrative for the plaintiffs. Furthermore, in some jurisdictions, a deliberate bypass of anti-bot protection and massive data scraping may constitute a criminal offense. Of note, all this has virtually nothing to do with copyright law. 
Ultimately, GenAI vendors – that now vigorously argue in courts that exploitation of third-party content for LLM training purposes constitutes a fair use exception under the copyright law – will likely face even greater liability under the avalanche of breach of contract claims. In sum, most GenAI vendors will soon face a tough reality: paying a fair price for high-quality training data, while staying profitable. In view of the formidable competition emanating from China, many Western GenAI companies may simply quit the business as economically unviable.”

Previous Media Publications:

Forbes: Cloudflare Sidesteps Copyright Issues, Blocking AI Scrapers By Default

SiliconANGLE: New Cloudflare feature lets websites charge AI developers for content access
