Insight
AI Bot Crawling: Balancing Security Risk with Search Visibility
The growth of AI bot traffic is changing the rules of the web. Is your platform ready to manage both the risk and the opportunity?
AI-driven bots are now a permanent feature of the modern web. For organisations managing complex website estates, the question is no longer whether these bots exist; it's how to respond to them without compromising performance, security or long-term visibility.
These bots don’t behave like traditional search engine crawlers that politely index content over time. Instead, they often bombard sites with high-frequency requests, sometimes ignoring directives such as robots.txt, which are intended to guide or restrict automated access to certain areas of a site.
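For context, robots.txt directives are simple per-agent Allow/Disallow rules served from a site's root. A minimal illustration (the paths are hypothetical, and compliance is entirely voluntary on the crawler's part):

```text
# Rules for all crawlers
User-agent: *
# Keep internal search result pages out of crawls
Disallow: /search/
Allow: /

# A politeness hint some crawlers honour;
# many aggressive AI bots ignore it entirely.
Crawl-delay: 10
```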
This surge in aggressive crawling is commonly linked to AI companies harvesting large volumes of data to train or refine their models. While this activity is rarely malicious in intent, the scale and speed involved can place serious strain on website infrastructure. In practice, this can degrade performance, increase bandwidth costs, and in some cases disrupt access for legitimate users.
When “Polite Web” Assumptions No Longer Hold
The impact of AI bot activity is particularly challenging because many of the long-standing assumptions about how the web operates are beginning to break down.
Even bots that throttle themselves or broadly respect crawling conventions can still cause issues when operating at scale. More concerning are those that ignore these norms altogether. We are seeing sharp spikes in requests that resemble distributed crawling campaigns and, at first glance, can look similar to a denial-of-service attack, not because they are hostile, but because they are driven by intensive data-gathering.
This creates both operational risk, in terms of servers being overwhelmed, and wider legal or ethical concerns around content use, consent, and transparency.
Designing for Bot Resilience
From a security perspective, this shift is now widely recognised. The UK’s National Cyber Security Centre (NCSC) explicitly acknowledges that online services must be designed with bot resilience in mind. In its guidance on ‘Building and operating a secure online service’, the NCSC highlights “protecting your service from bots” as a key consideration.
This isn’t just about reacting when something goes wrong. It requires proactive architectural measures, including the ability to detect abnormal bot patterns, apply rate limiting, and design systems that can absorb or mitigate high volumes of automated requests without compromising legitimate traffic.
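To make the rate-limiting idea concrete, here is a minimal sketch of a token bucket keyed by client IP. The names, limits and `should_serve` helper are illustrative only; in practice this enforcement happens at the edge (for example in a CDN or WAF layer), not in application code:

```python
import time
from collections import defaultdict


class TokenBucket:
    """Allow `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens in proportion to the time elapsed since the last check
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per client: sustain 5 requests/second, absorb bursts of up to 10
buckets = defaultdict(lambda: TokenBucket(rate=5, capacity=10))


def should_serve(client_ip: str) -> bool:
    """Return True if this client is within its request budget."""
    return buckets[client_ip].allow()
```

Legitimate visitors stay comfortably inside the budget, while a crawler hammering a site at hundreds of requests per second quickly exhausts its bucket and can be slowed or challenged rather than served.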
At Reading Room, this is something our combined website hosting services and website security services address using Cloudflare's tooling. We combine rate limiting, bot management and AI bot controls to respond to issues as they arise, alongside DDoS detection and protection for proactive resilience.
The Scale of AI Bot Activity Is Growing
This is not an isolated issue. ‘Cloudflare’s Year in Review 2025’ highlights several relevant trends seen across the internet:
- Global internet traffic grew by 19% during 2025.
- A significant and growing portion of that traffic now comes from bots, many designed to train AI models.
- GoogleBot generated the highest volume of verified bot traffic.
- 6% of traffic attempting to traverse Cloudflare’s global network was automatically mitigated as potentially malicious or blocked by rules.
- The US remains the largest source of bot traffic.
Together, these trends reinforce what we’re seeing day-to-day: AI bots are no longer occasional visitors; they are a persistent, high-volume presence.
Not All AI Bots Are the Same
It’s important to recognise that AI bot traffic isn’t uniform. There is a clear distinction between:
- AI chat session crawlers, which are triggered when real users interact with tools like ChatGPT, Claude or Gemini. These bots access content in response to legitimate user queries. In many cases, allowing controlled access to this type of crawler can support discoverability and relevance in AI-assisted search experiences.
- AI training bots, which are designed to scrape large volumes of content to train language models. These crawlers are more likely to generate aggressive traffic patterns and raise concerns around performance, attribution, and intellectual property. As a result, organisations often choose to restrict or block this type of traffic.
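One way to express this distinction is through per-agent robots.txt rules. The user-agent strings below are those published by the relevant vendors at the time of writing, but they do change, and, as noted earlier, robots.txt is advisory: actual enforcement still requires server or WAF rules:

```text
# Block bulk AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

# Allow user-triggered AI chat and search fetchers
User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /
```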
Treating all AI bots the same (either allowing everything or blocking everything) rarely delivers the best outcome. As leading providers of SEO services, we advocate for a more selective approach which enables organisations to protect performance and content, while remaining open to legitimate use.
The Visibility Consideration
There are also merits to allowing appropriate AI access. As AI-assisted search becomes more prevalent, ensuring your site can be indexed and referenced helps maintain visibility and relevance online.
Research shared by Search Engine Land suggests that allowing AI crawlers access to most public content is often a net benefit, as it increases the likelihood of that content being surfaced in AI search experiences. When content is referenced prominently, that visibility can improve brand perception and, in turn, conversions. However, it also explains that truly unique intellectual property should remain protected behind logins or paywalls.
This shift is accelerating. McKinsey’s analysis reports that roughly 50% of Google searches already include some form of AI-generated summary, such as AI Overviews, and that AI-assisted search is expected to grow to around 75% by 2028. Blocking AI bots entirely could therefore leave brands behind, making AI visibility an important consideration in long-term digital strategy.
A Balanced, Practical Approach
In summary, AI bots are no longer passive indexers; they are active, high-volume participants in the web ecosystem.
At Reading Room, we proactively defend against disruptive traffic as much as possible, but this activity is often unpredictable. In some cases, reactive action is required to stabilise a site experiencing unusually high AI-driven traffic.
Our aim is to provide resilient, stable architectures while remaining open to legitimate use. A blanket block on all AI bots is rarely the right answer. Instead, we focus on building systems that can protect performance and security, while still supporting long-term visibility as the web continues to evolve.
By monitoring traffic patterns, enforcing sensible access controls, and designing for resilience from the outset, we help clients navigate this changing landscape: protecting their resources without closing the door on opportunity.
Concerned about AI-driven traffic and search visibility?
Our team designs resilient platforms that balance security, performance and discoverability.