Cloudflare, a network services company, has introduced a one-click option that lets its web hosting customers block AI bots from scraping their website content without permission. The move responds to customer concerns about AI scrapers and, the company says, to a desire to keep the internet safe for content creators. The toggle complements the existing approach of asking bots to stay away via a robots.txt file.
The robots.txt file, placed in a website's root directory, is the long-established way for site operators to tell web crawlers which parts of a site to stay out of. Its effectiveness is questionable, though, because compliance is voluntary: a bot can simply ignore the directives, usually without consequence.
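For illustration, a robots.txt that asks a couple of well-known AI crawlers to keep out might look like the snippet below. GPTBot and CCBot are the published user-agent names of OpenAI's and Common Crawl's crawlers; whether either actually honors the file is, as noted, entirely up to the bot.

```
# Ask specific AI crawlers to stay out of the whole site.
# These directives are honored only if the bot chooses to comply.
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```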
Recent reports suggest some AI bots do exactly that. Perplexity, an AI search outfit, was accused of scraping websites without permission or suitable credit, and Amazon Web Services, which provides Perplexity's cloud infrastructure, said it was looking into the matter.
Cloudflare's new feature aims to put up a more robust barrier to bot entry. The company says AI bots are now flooding the internet, visiting about 39 percent of the top one million web properties it serves. To detect them, Cloudflare uses a machine-learning system that assigns each client a score based on its digital fingerprint, which is derived from technical details read through network interactions.
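Cloudflare has not published the internals of its classifier, so the sketch below is only a rough illustration of what fingerprint-based bot scoring looks like in general: the feature names, thresholds, and weights are invented placeholders, not Cloudflare's actual model.

```python
# Illustrative sketch only: Cloudflare's real classifier is a proprietary
# machine-learning model. The features and weights here are assumptions
# chosen to show the general shape of fingerprint-based scoring.
from dataclasses import dataclass

@dataclass
class RequestFingerprint:
    tls_fingerprint_known_bot: bool  # TLS handshake matches a known automation stack
    header_order_unusual: bool       # header ordering differs from mainstream browsers
    requests_per_minute: float       # observed request rate from this client
    honors_robots_txt: bool          # did the client fetch and respect robots.txt?

def bot_score(fp: RequestFingerprint) -> float:
    """Return a score in [0, 1]; higher means more bot-like (hypothetical weights)."""
    score = 0.0
    if fp.tls_fingerprint_known_bot:
        score += 0.4
    if fp.header_order_unusual:
        score += 0.2
    if fp.requests_per_minute > 60:  # a sustained high rate is a classic automation tell
        score += 0.3
    if not fp.honors_robots_txt:
        score += 0.1
    return min(score, 1.0)

# A client whose TLS stack matches known automation tooling and that hammers
# the site scores high enough to be challenged or blocked.
fp = RequestFingerprint(True, True, 120.0, False)
print(f"bot score: {bot_score(fp):.2f}")  # -> bot score: 1.00
```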
The company acknowledges that some AI outfits may try to dodge the rules to get at content, and says it will keep monitoring such activity and evolving its machine-learning models so the internet remains a place where content creators control their content. The feature is available even to free-tier customers: flip the "Block AI Scrapers and Crawlers" toggle in the Security -> Bots menu for a given website.
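For those who manage zones programmatically, the same sort of setting can in principle be flipped through Cloudflare's v4 API rather than the dashboard. The zone-level bot_management endpoint does exist, but the specific field controlling AI-bot blocking used below (ai_bots_protection) is an assumption; check Cloudflare's current API documentation before relying on it.

```python
# Sketch of toggling AI-bot blocking via Cloudflare's v4 API instead of the dashboard.
# The /bot_management endpoint is real, but the "ai_bots_protection" field name and
# its "block" value are assumptions and may differ from Cloudflare's current API.
import requests

API_TOKEN = "YOUR_API_TOKEN"  # placeholder: token with bot-management edit rights
ZONE_ID = "YOUR_ZONE_ID"      # placeholder: the zone (website) to protect

resp = requests.put(
    f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/bot_management",
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"ai_bots_protection": "block"},  # assumed field/value for the one-click toggle
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # the API echoes back the updated bot-management configuration
```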