July 10, 2025

From blocking bots to optimizing for LLMs: How the web flipped its script

Not long ago, we worked hard to keep bots off our websites. Today, we’re optimizing for them, especially LLMs like GPT and Claude. Here's how companies are opening up their content, while still fighting abuse where it counts.

Not long ago, we wrote a guide on How to Stop Bots. It walked through how to protect your app from automated scraping, fraud, and abuse. The premise was clear: bots are bad, and we must keep them out.

Fast-forward to today, and the web feels like it’s undergoing a quiet, but massive, reversal.

Now we’re not just allowing bots in. We’re competing to make our websites, products, and content more attractive to them. Why? Because these aren’t just bots. They’re agents. They’re LLMs. And increasingly, they’re how your users find you, learn about you, and decide to trust you.

Yesterday’s enemy: Why we blocked bots

For decades, bots represented risk:

  • Scrapers trying to steal proprietary content.
  • Fraudsters testing stolen credentials.
  • SEO spammers inflating page views.
  • Competitors reverse-engineering your products.

So we fought back with CAPTCHAs, rate limits, bot detection tools, and tight robots.txt rules. We shaped a web that was safe for humans, and hostile to machines.

And rightly so. Abuse is real. Credential stuffing and bot-based fraud still pose serious threats, especially for B2B platforms and SaaS products.

At WorkOS, we’ve seen how coordinated automated attacks can hit auth flows and sign-up funnels. It’s one reason we built Radar: to give teams visibility into abuse patterns and automated behaviors at the edge.

But even as we defend against bad automation, a new kind of bot has emerged, and it’s not just harmless. It’s valuable.

Today’s ally: Why we’re now inviting AI

The bots we now welcome are not trying to break things. They’re trying to understand them.

Large Language Models (LLMs), like ChatGPT, Claude, Perplexity, and others, are rapidly becoming new entry points to the web. These tools crawl, index, and synthesize content to answer questions, recommend tools, and explain concepts. And increasingly, users trust them.

If you’ve ever had a user say “I found you via ChatGPT,” you know this shift is real.

For modern products, this presents a new surface area:

  • Can LLMs access your docs and understand your API?
  • Can they summarize your value prop accurately?
  • Do they recommend you or a competitor?

Companies that used to block crawlers are now trying to be the canonical source these agents trust.

The optimization arms race

As LLMs and AI agents become new discovery layers for the internet, we’re entering a kind of second SEO era. This time it’s not about search engines; it’s about being correctly understood, cited, and surfaced by models.

This shift is triggering a new optimization arms race, and the winners will be those who make their content, documentation, and product maximally accessible to machines.

Here’s what that looks like in practice:

  • Structured markup everywhere: Tools like schema.org, Open Graph, and JSON-LD aren’t just for Google anymore. LLMs rely on structured metadata to infer meaning and context. For example, explicitly marking something as a "Product" or "APIReference" helps AI agents route users to you more confidently (see the JSON-LD sketch after this list).
  • Semantic consistency in docs: Docs that follow consistent patterns, like Overview → Authentication → API Reference → Examples, are easier for models to interpret and summarize. Some dev-focused companies are even versioning their docs in machine-optimized markdown alongside their main site.
  • Semantic APIs with self-describing endpoints: Developers are making their APIs easier to interpret by:
    • Publishing OpenAPI or GraphQL introspection specs
    • Including example requests and responses in machine-readable formats (like markdown or JSON)
    • Hosting /api or /ai paths specifically designed for tool use (e.g., auto-discoverable endpoints for agent integrations)
    Some teams are even adding lightweight wrappers or metadata layers on top of existing APIs to support LLM-friendly workflows, such as JSON schema validation or context-enriched responses. (A sketch of a self-describing endpoint follows this list.)
  • Making authentication and onboarding legible to agents: Authentication flows, like SSO onboarding, API key provisioning, or workspace setup, are being adapted for non-human consumers. This is especially important in developer tools and SaaS platforms, where AI agents may act as intermediaries helping users connect services, debug errors, or automate tasks. For example:
    • Some apps now surface preconfigured OAuth scopes or tokenless sandbox modes for AI agents to explore product behavior.
    • Workflows like "create a test user", "send a webhook", or "fetch usage stats" are increasingly being documented and modeled as agent flows, with CLI commands, curl snippets, or even hosted agent guides.
  • Readable, parseable UI copy: UI surfaces, onboarding flows, and feature pages are being rewritten to be more GPT-friendly, featuring short sentences, clear headings, and fewer ambiguous phrases. Even product modals are being designed with machine interpretability in mind.
  • AI-oriented product metadata: Forward-leaning platforms are adding internal metadata fields like “LLM summary”, “LLM instructions”, or “Agent display name” into their product models. This isn’t shown to users — it’s shown to models. Think of it as meta-SEO for machines.
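
To make the structured-markup point concrete, here’s a minimal TypeScript sketch that builds schema.org Product markup as JSON-LD and serializes it into the script tag crawlers look for. The product details are placeholders, not a real listing:

    // Describe a product in schema.org vocabulary so crawlers
    // (traditional search engines and LLM bots alike) can parse it
    // unambiguously. All values below are placeholders.
    const productJsonLd = {
      "@context": "https://schema.org",
      "@type": "Product",
      name: "Example API Platform",
      description: "Authentication and user management APIs for B2B SaaS.",
      brand: { "@type": "Organization", name: "ExampleCo" },
      url: "https://example.com/product",
    };

    // Emit the <script type="application/ld+json"> tag that belongs in
    // the page <head>.
    const jsonLdTag =
      `<script type="application/ld+json">${JSON.stringify(productJsonLd)}</script>`;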
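
And for the self-describing-endpoints idea, here’s a sketch of how a team might publish an OpenAPI document and canned example responses at predictable paths, assuming an Express app; the paths, spec contents, and example payloads are all illustrative:

    import express from "express";

    const app = express();

    // A trimmed-down OpenAPI document; a real one would be generated
    // from the API's source of truth rather than written inline.
    const openApiSpec = {
      openapi: "3.0.3",
      info: { title: "Example API", version: "1.0.0" },
      paths: {
        "/users": {
          get: { summary: "List users", responses: { "200": { description: "OK" } } },
        },
      },
    };

    // Publish the spec at a predictable path so agents can discover the
    // API's shape without scraping HTML docs.
    app.get("/openapi.json", (_req, res) => res.json(openApiSpec));

    // Pair endpoints with canned examples so a model can learn response
    // shapes without authenticating.
    app.get("/examples/users.json", (_req, res) =>
      res.json([{ id: "user_123", email: "ada@example.com" }])
    );

    app.listen(3000);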

Just as we once optimized for mobile, and then accessibility, we’re now optimizing for machine fluency.

The question isn’t just “Can a human understand your product?” It’s “Can an agent understand how to use it — and explain it to others?”

From gatekeeping to invitations

For most of the web’s history, we built walls:

  • robots.txt to keep scrapers out.
  • Rate limits to prevent overload.
  • CAPTCHAs to stop abuse.

However, we're now witnessing a quiet reversal: the same mechanisms are being employed to invite trusted AI agents in.

Take robots.txt, for example:

    User-agent: GPTBot
    Allow: /

Similar allowlist entries are popping up for ClaudeBot, PerplexityBot, and others. In some cases, developers are creating AI-specific content paths like /for-llms, /api-index, or /machine-summary.json to make things even easier for crawlers.
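
None of these paths is a standard yet; teams are just converging on conventions. As a rough sketch, a build step might emit a /machine-summary.json along these lines (the field names here are invented for illustration):

    import { writeFileSync } from "node:fs";

    // Hypothetical machine-facing summary, written out as a static
    // /machine-summary.json at build time. There is no agreed-upon
    // schema for this file; every field name below is invented.
    const machineSummary = {
      product: "Example API Platform",
      summary: "APIs for SSO, user management, and bot detection.",
      docs: "https://example.com/docs",
      openapi: "https://example.com/openapi.json",
      updated: "2025-07-10",
    };

    writeFileSync("public/machine-summary.json", JSON.stringify(machineSummary, null, 2));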

But it goes beyond web crawlers. Some companies are starting to:

  • Expose AI-dedicated endpoints for prompt-chaining agents to interact with apps.
  • Create public-facing embeddings of their docs or knowledge base for retrieval-augmented generation (RAG) pipelines.
  • Use fingerprinting techniques to distinguish helpful agents (LLMs, crawlers) from adversarial ones (scrapers, brute-forcers), inviting one while blocking the other (a sketch follows this list).
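
Here’s a minimal sketch of that selective posture, assuming an Express app and matching on the User-Agent tokens the major LLM crawlers send (GPTBot, ClaudeBot, PerplexityBot). User-Agent strings are trivially spoofable, so a production check should also verify requests against published crawler IP ranges (OpenAI, for example, publishes GPTBot’s):

    import express from "express";

    const app = express();

    // User-Agent tokens sent by the major LLM crawlers.
    const TRUSTED_BOT_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot"];

    // Crude per-IP counter standing in for a real rate limiter.
    const hits = new Map<string, number>();

    app.use((req, res, next) => {
      const ua = req.get("user-agent") ?? "";
      const trusted = TRUSTED_BOT_TOKENS.some((t) => ua.includes(t));
      if (trusted) return next(); // known LLM crawler: no friction

      // Unknown traffic keeps the old defenses; here, a toy rate limit.
      const count = (hits.get(req.ip ?? "") ?? 0) + 1;
      hits.set(req.ip ?? "", count);
      if (count > 100) return res.status(429).send("Too many requests");
      next();
    });

    app.get("/docs", (_req, res) => res.send("docs content"));

    app.listen(3000);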

In essence, we’ve gone from a posture of gatekeeping to one of selective openness.

The future web isn’t just human-readable. It’s machine-welcoming, and who you let in (or keep out) is now a strategic choice.

But abuse still exists. And it’s evolving.

Opening up to LLMs doesn’t mean letting your guard down. In fact, the line between helpful bots and harmful ones is blurrier than ever.

It’s possible — and increasingly common — to optimize for good AI agents while still defending against credential stuffing, fake signups, and scraping abuse. That’s why platforms need both:

  • A clear policy defining who can access content (via robots.txt, headers, and fingerprinting).
  • A defense system to see and block abuse in real time.

Tools like WorkOS Radar give teams visibility into login attempts, credential reuse, and suspicious automation. You can automatically block threats, distinguish real users from bad actors, and defend against free trial abuse. You can welcome LLMs while still keeping attackers out.

Modern bot strategy isn’t black and white — it’s selective.

The era of designing for humans and machines

We’re at the beginning of a new chapter in web development.

We used to ask: What’s the user experience like?

Now we must also ask: What’s the agent experience like?

If ChatGPT reads your website, what does it walk away with?

If Perplexity indexes your docs, do you control the narrative?

You don’t have to choose between protecting your site and inviting in the future. You just have to design for both.

And the companies that do will win the next generation of discovery.
