GPTBot
OpenAI's web crawler that collects training data for future models; it does not affect ChatGPT live retrieval.
Definition
GPTBot is the user-agent token of the OpenAI crawler that collects web content for training future models. It is distinct from OAI-SearchBot (live retrieval for ChatGPT search) and ChatGPT-User (on-demand URL fetches). Disallowing GPTBot in robots.txt opts your content out of future training and has no effect on whether ChatGPT can cite you live.
When to use
Block GPTBot if you don't want your content used for model training. Keep OAI-SearchBot and ChatGPT-User allowed: blocking OAI-SearchBot is what removes you from ChatGPT search results, and blocking ChatGPT-User stops on-demand fetches of your URLs.
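As a minimal sketch of this setup, the snippet below embeds a hypothetical robots.txt that disallows GPTBot and leaves OAI-SearchBot and ChatGPT-User allowed, then checks it with Python's standard-library urllib.robotparser. The example.com URL and the allowed() helper are illustrative assumptions, and the standard-library parser only approximates how OpenAI's crawlers interpret the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of training, stay reachable for ChatGPT search.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
"""


def allowed(user_agent: str, url: str = "https://example.com/article") -> bool:
    """Return True if the rules above let `user_agent` fetch `url`.

    Illustrative helper (not part of any OpenAI tooling); the URL is a placeholder.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Check each OpenAI user-agent token against the rules.
results = {ua: allowed(ua) for ua in ("GPTBot", "OAI-SearchBot", "ChatGPT-User")}
print(results)
# → {'GPTBot': False, 'OAI-SearchBot': True, 'ChatGPT-User': True}
```

To apply the rules, serve the three records from /robots.txt on your own site. A check like this is useful before deploying, since a stray wildcard rule can block the retrieval crawlers by accident.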
See also
- OAI-SearchBot — OpenAI's retrieval crawler that fetches pages for ChatGPT search; block it and your pages drop out of ChatGPT search results.
- ChatGPT — OpenAI's consumer LLM product: chat, web browsing, and plugins, powered by GPT-class models.
- robots.txt — Plain text file at /robots.txt that tells crawlers which paths they may or may not fetch.
- AI crawler — Web crawler operated by an AI company, for example GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, or Google-Extended.