# =============================================================================
# AI Tracker Hive - robots.txt
# Enterprise-grade crawler directives
# Last updated: 2025-12-24
# =============================================================================

# -----------------------------------------------------------------------------
# Default Rules (All Crawlers)
# -----------------------------------------------------------------------------
User-agent: *
# System and Infrastructure (Never index)
Disallow: /api/
Disallow: /cdn-cgi/
Disallow: /_next/
# Authentication Pages (All languages)
Disallow: /sign-in
Disallow: /sign-up
Disallow: /*/sign-in
Disallow: /*/sign-up
# Admin Dashboard (All languages)
Disallow: /admin
Disallow: /*/admin
# User Settings (All languages)
Disallow: /settings
Disallow: /*/settings
# User-Specific Content (All languages)
Disallow: /chat
Disallow: /*/chat
Disallow: /activity
Disallow: /*/activity
Disallow: /likes
Disallow: /*/likes
Disallow: /history
Disallow: /*/history
# Blog (Not implemented - temporary)
Disallow: /blog
Disallow: /*/blog
# Pricing (Template page - temporary)
Disallow: /pricing
Disallow: /*/pricing
# Deleted/Legacy Pages (Prevent crawling old links)
Disallow: /showcases
Disallow: /*/showcases
Disallow: /docs
Disallow: /*/docs
Disallow: /ai-image-generator
Disallow: /ai-music-generator
Disallow: /ai-video-generator
# Allow ads.txt for monetization verification
Allow: /ads.txt

# -----------------------------------------------------------------------------
# Google AdSense Bot
# -----------------------------------------------------------------------------
User-agent: Mediapartners-Google
Allow: /
Allow: /ads.txt

# -----------------------------------------------------------------------------
# Googlebot (Standard)
# -----------------------------------------------------------------------------
User-agent: Googlebot
Allow: /
Disallow: /api/
Disallow: /cdn-cgi/
Disallow: /_next/
Disallow: /sign-in
Disallow: /sign-up
Disallow: /*/sign-in
Disallow: /*/sign-up
Disallow: /admin
Disallow: /*/admin
Disallow: /settings
Disallow: /*/settings
Disallow: /chat
Disallow: /*/chat
Disallow: /activity
Disallow: /*/activity
Disallow: /likes
Disallow: /*/likes
Disallow: /history
Disallow: /*/history
Disallow: /blog
Disallow: /*/blog
Disallow: /pricing
Disallow: /*/pricing
Disallow: /showcases
Disallow: /*/showcases
Disallow: /docs
Disallow: /*/docs

# -----------------------------------------------------------------------------
# AI Crawlers (LLM Training & Retrieval)
# -----------------------------------------------------------------------------
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: Claude-Web
User-agent: Anthropic-AI
User-agent: anthropic-ai
User-agent: PerplexityBot
User-agent: GoogleOther
User-agent: DuckAssistBot
User-agent: CCBot
# Allow AI crawlers to access main content
Allow: /
Allow: /artists
Allow: /*/artists
# Block private/system paths
Disallow: /api/
Disallow: /admin
Disallow: /*/admin
Disallow: /settings
Disallow: /*/settings
Disallow: /chat
Disallow: /*/chat
Disallow: /activity
Disallow: /*/activity
Disallow: /sign-in
Disallow: /sign-up
Disallow: /*/sign-in
Disallow: /*/sign-up
# LLM-specific content guides (non-standard but useful)
# These files provide structured content for AI understanding
# LLM-Content: https://aitrackerhive.com/llms.txt
# LLM-Full-Content: https://aitrackerhive.com/llms-full.txt

# -----------------------------------------------------------------------------
# Rate-Limited Crawlers
# Polite crawl delays to prevent server overload
# -----------------------------------------------------------------------------
User-agent: bingbot
Crawl-delay: 1

User-agent: Baiduspider
Crawl-delay: 2

User-agent: YandexBot
Crawl-delay: 1

User-agent: Sogou spider
Crawl-delay: 2

User-agent: Sosospider
Crawl-delay: 2

User-agent: YoudaoBot
Crawl-delay: 2

User-agent: YetiBot
Crawl-delay: 2

User-agent: Yahoo! Slurp
Crawl-delay: 1

User-agent: Seznambot
Crawl-delay: 1

User-agent: rdfbot
Crawl-delay: 2

# -----------------------------------------------------------------------------
# Sitemaps
# -----------------------------------------------------------------------------
Sitemap: https://aitrackerhive.com/sitemap-index.xml