# Code Pathfinder - robots.txt
# Optimized for AI search discoverability (November 2025)
# https://codepathfinder.dev

# ============================================
# TRADITIONAL SEARCH ENGINES
# ============================================

User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: DuckDuckBot
Allow: /

User-agent: Slurp
Allow: /

User-agent: YandexBot
Allow: /

# ============================================
# OPENAI CRAWLERS
# ============================================

# AI Search - shows pages as links in ChatGPT search (NOT for training)
User-agent: OAI-SearchBot
Allow: /

# User-triggered browsing from ChatGPT conversations
User-agent: ChatGPT-User
User-agent: ChatGPT-User/2.0
Allow: /

# Model training crawler - Allow for better AI understanding of security tools
User-agent: GPTBot
Allow: /

# ============================================
# ANTHROPIC (CLAUDE) CRAWLERS
# ============================================

# Chat citation fetch - used when Claude cites sources
User-agent: ClaudeBot
Allow: /

# Web-focused crawl
User-agent: claude-web
Allow: /

# Model training - Allow for Claude to understand security scanning
User-agent: anthropic-ai
Allow: /

# ============================================
# PERPLEXITY CRAWLERS
# ============================================

# Index builder for Perplexity AI search
User-agent: PerplexityBot
Allow: /

# Human-triggered visits from Perplexity
User-agent: Perplexity-User
Allow: /

# ============================================
# GOOGLE AI (GEMINI)
# ============================================

# Controls access for Gemini AI training
User-agent: Google-Extended
Allow: /

# Google's agentic browser (Project Mariner)
User-agent: GoogleAgent-Mariner
Allow: /

# ============================================
# OTHER AI ASSISTANTS
# ============================================

# Amazon Alexa and Fire OS AI features
User-agent: Amazonbot
Allow: /

# Apple Siri and Spotlight
User-agent: Applebot
User-agent: Applebot-Extended
Allow: /

# Mistral AI (Le Chat assistant)
User-agent: MistralAI-User
Allow: /

# DuckDuckGo AI answers
User-agent: DuckAssistBot
Allow: /

# You.com AI search
User-agent: YouBot
Allow: /

# Cohere language models
User-agent: cohere-ai
Allow: /

# ============================================
# SOCIAL MEDIA CRAWLERS (for link previews)
# ============================================

User-agent: FacebookBot
User-agent: meta-externalagent
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: Twitterbot
Allow: /

User-agent: Bytespider
Allow: /

# ============================================
# RESEARCH & OPEN DATA CRAWLERS
# ============================================

# Academic research (Semantic Scholar)
User-agent: AI2Bot
Allow: /

# Common Crawl - open datasets for AI research
User-agent: CCBot
Allow: /

# Structured data extraction
User-agent: Diffbot
Allow: /

# Decentralized search
User-agent: Timpibot
Allow: /

# ============================================
# DEFAULT RULE
# ============================================

User-agent: *
Allow: /
Disallow: /api/
Disallow: /_next/
Disallow: /private/

# ============================================
# SITEMAP
# ============================================

Sitemap: https://codepathfinder.dev/sitemap.xml

# ============================================
# LLM-READABLE CONTENT
# ============================================

# Machine-readable content for AI assistants
# Summary: https://codepathfinder.dev/llms.txt
# Full docs + all 190+ rules: https://codepathfinder.dev/llms-full.txt
# Per-rule markdown: https://codepathfinder.dev/registry/{language}/{category}/{rule-id}/md