🤖 robots.txt Generator
Build a robots.txt file visually. Configure allow/disallow rules per user-agent, set a crawl delay, and add a sitemap URL. Free online robots.txt generator for SEO.
How to Use
1. Choose a preset
Start with a preset: Allow All, Block All, Block AI Bots (GPTBot, ClaudeBot, etc.), or SEO-Friendly.
2. Customize the rules
Add or remove User-agent rules and Disallow/Allow paths using the rule builder below the presets.
3. Download your file
Add your sitemap URL (optional), then click Download to save your robots.txt file.
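A generated file following these steps might look like this (example.com and the paths shown are placeholders):

```
User-agent: *
Disallow: /admin/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
```

Upload the file to the root of your domain so crawlers can find it at example.com/robots.txt.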
Frequently Asked Questions
What is a robots.txt file?
robots.txt is a file placed at the root of your website (example.com/robots.txt) that tells web crawlers which pages they can or cannot request. It follows the Robots Exclusion Protocol and is the first thing most crawlers fetch.
Does robots.txt prevent pages from being indexed?
No — robots.txt prevents crawling, not indexing. If other pages link to a disallowed URL, Google can still index it without crawling it. To prevent indexing, use the noindex meta tag or X-Robots-Tag header.
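For example, to keep a page out of the index while still allowing it to be crawled, place this tag in the page's head (an illustrative snippet):

```
<meta name="robots" content="noindex">
```

For non-HTML resources such as PDFs, send the equivalent HTTP response header `X-Robots-Tag: noindex` instead.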
How do I block AI training bots?
Use the "Block AI Bots" preset. Common AI crawlers include GPTBot (OpenAI), CCBot (Common Crawl), Google-Extended (Google AI training), anthropic-ai (Anthropic), and ChatGPT-User. For each bot, add a User-agent: BotName line followed by Disallow: /.
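The preset produces a block like the following; the agent list is illustrative, since new AI crawlers appear regularly:

```
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: ChatGPT-User
Disallow: /
```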
What does "Disallow: /" mean?
Disallow: / blocks all pages on the site for that user-agent. Disallow: /admin/ blocks only the /admin/ directory. Disallow: (empty) means allow everything. Allow: /public/ within a blocked section creates an exception.
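You can check these rules yourself with Python's standard-library `urllib.robotparser`; a minimal sketch, with example.com as a placeholder domain. Note that Python's parser applies rules in file order, so the Allow exception is listed before the broader Disallow:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Blocked: matches Disallow: /admin/
print(rp.can_fetch("*", "https://example.com/admin/secret"))       # False
# Allowed: the Allow exception matches first
print(rp.can_fetch("*", "https://example.com/admin/public/page"))  # True
# Allowed: no rule matches, so crawling defaults to permitted
print(rp.can_fetch("*", "https://example.com/blog/post"))          # True
```

This is the same parser many Python crawlers use, so it is a convenient way to sanity-check a generated file before deploying it.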
Is robots.txt case-sensitive?
Path matching in robots.txt is case-sensitive: Disallow: /Admin/ and Disallow: /admin/ match different paths. User-agent names, by contrast, are case-insensitive.
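The same standard-library parser demonstrates the case sensitivity of paths (example.com is a placeholder):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /Admin/"])

# Exact case matches the rule, so the path is blocked
print(rp.can_fetch("*", "https://example.com/Admin/users"))  # False
# Lowercase path does not match Disallow: /Admin/, so it is allowed
print(rp.can_fetch("*", "https://example.com/admin/users"))  # True
```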
Complete Guide: Robots.txt Generator
What is robots.txt?
robots.txt tells search engine crawlers which parts of your site they may crawl. It is advisory only, not a security mechanism. Use it to exclude admin pages, internal search results, and API endpoints.
How to Use
- Define rules per user-agent (Googlebot, or * for all).
- Add Disallow paths.
- Include your sitemap URL at the end.
- Upload the file to your domain root.
Pro Tips
- Disallow: / blocks the entire site; double-check before deploying.
- Blocking a page with robots.txt does not remove it from the index once indexed; use noindex instead.
- Google ignores Crawl-delay; use Search Console to limit crawl rate.