robots.txt

Plain text file at /robots.txt that tells crawlers which paths they may or may not fetch.

Definition

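robots.txt is a plain text file served from the root of a site (for example, https://example.com/robots.txt) that signals to web crawlers which paths they may visit. The format is line-based: a User-agent line declares which bot the following rules apply to, then Allow and Disallow directives list the path prefixes that bot may or may not fetch. The file is advisory, not enforced: well-behaved crawlers (Google, OpenAI, Anthropic, Perplexity) honor it, while hostile scrapers ignore it. That makes it critical for steering AI crawler access to content that is not behind authentication, but it is not an access control.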
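As a rough illustration of the line-based format, the sketch below feeds a small, made-up rule set to Python's standard-library urllib.robotparser and asks which URLs a given bot may fetch. The bot names ExampleBot and OtherBot, the paths, the helper can_fetch, and example.com are all hypothetical. Real crawlers may resolve overlapping rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ExampleBot is kept out of /private/,
# every other crawler is allowed everywhere.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Allow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ExampleBot matches its own group, so the /private/ prefix is off limits.
blocked = can_fetch(SAMPLE_ROBOTS_TXT, "ExampleBot", "https://example.com/private/report.html")
# No rule in ExampleBot's group covers /blog/, so fetching is allowed.
allowed = can_fetch(SAMPLE_ROBOTS_TXT, "ExampleBot", "https://example.com/blog/")
# OtherBot has no group of its own and falls back to the "*" group.
other = can_fetch(SAMPLE_ROBOTS_TXT, "OtherBot", "https://example.com/private/report.html")
```

Here blocked is False while allowed and other are True, which shows that rules apply per User-agent group and that paths are matched as prefixes.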

When to use

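Edit robots.txt when you want to allow or block specific bots by their user-agent token. For AI search visibility, allow OAI-SearchBot, PerplexityBot, and ClaudeBot; to opt out of model training, block GPTBot and Google-Extended. Google-Extended is a control token read by Google's existing crawlers rather than a separate bot, so blocking it does not affect Google Search crawling.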
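One possible way to express that policy is sketched below: a small helper writes one group per bot token and then checks the result with Python's standard-library urllib.robotparser. The helper names build_robots_txt and is_allowed, the example.com URL, and SomeOtherBot are hypothetical, and the file assumes whole-site allow or block rather than per-path rules.

```python
from urllib.robotparser import RobotFileParser

# Tokens taken from the guidance above.
SEARCH_BOTS = ["OAI-SearchBot", "PerplexityBot", "ClaudeBot"]
TRAINING_TOKENS = ["GPTBot", "Google-Extended"]


def build_robots_txt(allow_agents, block_agents):
    """Build robots.txt text with one group per agent (hypothetical helper)."""
    groups = []
    for agent in allow_agents:
        groups.append(f"User-agent: {agent}\nAllow: /\n")
    for agent in block_agents:
        groups.append(f"User-agent: {agent}\nDisallow: /\n")
    # A blank line separates consecutive groups.
    return "\n".join(groups)


def is_allowed(robots_txt, user_agent, url="https://example.com/article"):
    """Check a single user agent against the generated rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


policy = build_robots_txt(SEARCH_BOTS, TRAINING_TOKENS)
decisions = {agent: is_allowed(policy, agent) for agent in SEARCH_BOTS + TRAINING_TOKENS}
```

In decisions, the three search bots map to True and the two training tokens map to False. A bot that appears in no group, such as the made-up SomeOtherBot, is allowed by default because the file contains no "*" group.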

See also