AI Robots.txt Generator

A simple tool to generate an optimized robots.txt file that properly configures access for AI search engine crawlers like ChatGPT (GPTBot), Perplexity (PerplexityBot), Google Gemini, Claude, and more.

Online Tools

Prefer a web-based tool? Try our free online generators, no installation needed:

Why This Matters

Many websites accidentally block AI search engine crawlers, making their content invisible to ChatGPT, Perplexity, and other AI-powered search engines. This tool helps you generate a properly configured robots.txt that:

Quick Start

Option 1: Use the Web Tool

Check your current robots.txt configuration: GEOScore Robots.txt Checker

Option 2: Use the Script

# Clone this repo
git clone https://github.com/henu-wang/ai-robots-txt-generator.git
cd ai-robots-txt-generator

# Generate robots.txt
python generate.py --domain yourdomain.com --output robots.txt

Option 3: Copy the Template

Copy the template below and customize it for your site:

# ============================================
# AI Search Engine Crawlers Configuration
# Generated by: https://github.com/henu-wang/ai-robots-txt-generator
# Guide: https://geoscoreai.com/blog/robots-txt-ai-crawlers
# ============================================

# OpenAI - ChatGPT & SearchGPT
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /

# Google - Gemini (training and grounding; AI Overviews follow Googlebot)
User-agent: Google-Extended
Allow: /

# Anthropic - Claude
User-agent: ClaudeBot
Allow: /

# Apple - Apple Intelligence & Siri
User-agent: Applebot-Extended
Allow: /

# Meta AI
User-agent: Meta-ExternalAgent
Allow: /

# Cohere AI
User-agent: cohere-ai
Allow: /

# ByteDance - Doubao
User-agent: Bytespider
Allow: /

# ============================================
# Standard Search Engine Crawlers
# ============================================

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Allow: /

# ============================================
# Sitemap
# ============================================
Sitemap: https://yourdomain.com/sitemap.xml
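
The template follows a regular pattern: one `User-agent` group per crawler, each with `Allow: /`, followed by a `Sitemap` line. The sketch below shows one way such a file could be assembled from a list of crawler tokens. It is not the repository's `generate.py`; the names `AI_CRAWLERS`, `SEARCH_CRAWLERS`, and `render_robots` are hypothetical and exist only for illustration.

```python
# Hypothetical sketch, not the repository's generate.py.
# Crawler tokens are taken from the template in this README.
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "PerplexityBot",
    "Google-Extended", "ClaudeBot", "Applebot-Extended",
    "Meta-ExternalAgent", "cohere-ai", "Bytespider",
]
SEARCH_CRAWLERS = ["Googlebot", "Bingbot", "*"]


def render_robots(domain, crawlers=AI_CRAWLERS + SEARCH_CRAWLERS):
    """Return robots.txt text with one 'Allow: /' group per crawler."""
    lines = []
    for agent in crawlers:
        # Each group is a User-agent line, its rule, and a blank separator.
        lines += [f"User-agent: {agent}", "Allow: /", ""]
    lines.append(f"Sitemap: https://{domain}/sitemap.xml")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots("yourdomain.com"))
```

The sitemap URL is assumed to live at `/sitemap.xml` over HTTPS, as in the template; adjust it if your site differs.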

AI Crawler Reference

| Crawler | Operator | Product | Documentation |
| --- | --- | --- | --- |
| GPTBot | OpenAI | ChatGPT | docs |
| ChatGPT-User | OpenAI | ChatGPT Browse | docs |
| OAI-SearchBot | OpenAI | ChatGPT Search | docs |
| PerplexityBot | Perplexity | Perplexity AI | docs |
| Google-Extended | Google | Gemini | docs |
| ClaudeBot | Anthropic | Claude | docs |
| Applebot-Extended | Apple | Apple Intelligence | docs |
| Meta-ExternalAgent | Meta | Meta AI | docs |
| cohere-ai | Cohere | Cohere AI | - |
| Bytespider | ByteDance | Doubao/TikTok | - |

Usage

python generate.py --help

Basic Usage

# Generate with default settings (allow all AI crawlers)
python generate.py --domain example.com

# Output to file
python generate.py --domain example.com --output robots.txt

# Block specific crawlers
python generate.py --domain example.com --block GPTBot,Bytespider

# Allow only specific crawlers
python generate.py --domain example.com --allow-only GPTBot,PerplexityBot,Google-Extended
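
One plausible reading of `--block` and `--allow-only` is that they decide, per crawler, whether its group gets `Allow: /` or `Disallow: /`. The sketch below illustrates that reading only. The helpers `parse_names` and `directive_for` are hypothetical, and the real script may behave differently, for example in how it treats names outside its known list.

```python
# Hypothetical sketch of assumed --block / --allow-only semantics.
def parse_names(value):
    """Split a comma-separated option value into a set of crawler names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def directive_for(agent, block=frozenset(), allow_only=None):
    """Return 'Allow: /' or 'Disallow: /' for one crawler."""
    if allow_only is not None:
        # Allow-list mode: anything not named is blocked.
        return "Allow: /" if agent in allow_only else "Disallow: /"
    # Block-list mode: anything not named is allowed.
    return "Disallow: /" if agent in block else "Allow: /"


if __name__ == "__main__":
    blocked = parse_names("GPTBot,Bytespider")
    for agent in ["GPTBot", "PerplexityBot", "Bytespider"]:
        print(f"User-agent: {agent}")
        print(directive_for(agent, block=blocked))
        print()
```

The comparison here is case-sensitive for simplicity, whereas crawlers match `User-agent` tokens case-insensitively.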

Advanced Options

# Add custom disallow paths
python generate.py --domain example.com --disallow "/admin,/api,/private"

# Include crawl-delay
python generate.py --domain example.com --crawl-delay 10

# Multiple sitemaps
python generate.py --domain example.com --sitemap "https://example.com/sitemap.xml,https://example.com/blog-sitemap.xml"
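
For orientation, these options would correspond to robots.txt lines roughly like those produced by the sketch below. The functions `render_group` and `render_sitemaps` are hypothetical and are not taken from `generate.py`. Note that `Crawl-delay` is not part of RFC 9309 and Googlebot ignores it, so treat it as a hint that only some crawlers honor.

```python
# Hypothetical sketch of how the advanced options could map to directives.
def render_group(agent, disallow="", crawl_delay=None):
    """Render one User-agent group as a list of robots.txt lines."""
    lines = [f"User-agent: {agent}"]
    paths = [p.strip() for p in disallow.split(",") if p.strip()]
    # Disallow lines go first so that first-match parsers agree with
    # longest-match parsers (RFC 9309) about the listed paths.
    lines += [f"Disallow: {p}" for p in paths]
    lines.append("Allow: /")
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    return lines


def render_sitemaps(sitemaps):
    """Turn a comma-separated list of URLs into Sitemap lines."""
    return [f"Sitemap: {u.strip()}" for u in sitemaps.split(",") if u.strip()]


if __name__ == "__main__":
    group = render_group("GPTBot", disallow="/admin,/api,/private", crawl_delay=10)
    maps = render_sitemaps(
        "https://example.com/sitemap.xml,https://example.com/blog-sitemap.xml"
    )
    print("\n".join(group + [""] + maps))
```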

Check Your Configuration

After deploying your robots.txt, verify it works:

  1. Free scan: GEOScore, which scans your entire site for AI search readiness
  2. Robots.txt check: GEOScore Robots.txt Checker, which validates AI crawler access
  3. AI crawl check: GEOScore AI Crawl Checker, which tests whether AI bots can reach your content
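
A quick local check is also possible with Python's standard-library `urllib.robotparser`, as sketched below. The helper name `blocked_agents` and the `AI_AGENTS` list are hypothetical. The sketch parses robots.txt text that you supply; to test a live site you could instead call `set_url()` and `read()` on the parser. It reports only what the rules say, not whether a firewall or CDN blocks a bot.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens from the template in this README.
AI_AGENTS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "PerplexityBot",
    "Google-Extended", "ClaudeBot", "Applebot-Extended",
    "Meta-ExternalAgent", "cohere-ai", "Bytespider",
]


def blocked_agents(robots_txt, url="https://example.com/", agents=AI_AGENTS):
    """Return the agents that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    sample = (
        "User-agent: GPTBot\n"
        "Disallow: /\n"
        "\n"
        "User-agent: *\n"
        "Allow: /\n"
    )
    print(blocked_agents(sample))
```

`urllib.robotparser` applies the first matching rule in a group rather than the longest match described in RFC 9309, so results can differ from real crawlers when a group mixes overlapping `Allow` and `Disallow` paths.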

Contributing

PRs welcome! If you know of new AI crawlers or have suggestions for the generator, please open an issue or submit a PR.

License

MIT License