Cloudflare introduces permission-based internet scraping for AI crawlers, signalling a new business model

Tuesday, July 1, 2025
Mixed response to solution
Dr. Kolochenko, CEO at ImmuniWeb and a Fellow at the British Computer Society (BCS), released a statement about the news, unpicking the pros and cons of the development.
He said: “This long-awaited feature by Cloudflare is a true disaster for many GenAI vendors, which may be fatal to the current business models of GenAI. Given that Cloudflare protects the majority of the world’s most popular websites, as well as millions of smaller websites that publish academic and scientific content, this security feature will elegantly prevent data-greedy bots from unwarrantedly scraping human-created content without permission and without paying for it.
“Ironically, the fierce legal battles currently taking place in courts on both sides of the Atlantic – disputing the alleged copyright infringements by numerous AI vendors – are mostly re-litigating arguments that are already lost. At the end of the day, these lawsuits will bring from little to no value to GenAI vendors: virtually all creative content providers are incrementally protecting their content with advanced anti-bot protection mechanisms, which Cloudflare has just made available to everybody in one click. Furthermore, content providers add specific contractual provisions to their terms of service that expressly prohibit any use of their data for LLM training purposes.
“In case of a violation of such terms of service, content providers will have a straightforward and time-tested legal claim for breach of contract, possibly accompanied with liquidated damages per violation, making such claims extremely lucrative for the plaintiffs. Furthermore, in some jurisdictions, a deliberate bypass of anti-bot protection and massive data scraping may constitute a criminal offense. Of note, all this has virtually nothing to do with copyright law. Ultimately, GenAI vendors – that now vigorously argue in courts that exploitation of third-party content for LLM training purposes constitutes a fair use exception under the copyright law – will likely face even greater liability under the avalanche of breach of contract claims.”
Dr. Kolochenko added: “In sum, most GenAI vendors will soon face a tough reality: paying a fair price for high-quality training data, while staying profitable. In view of the formidable competition emanating from China, many Western GenAI companies may simply quit the business as economically unviable.”