# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a) If a Content-Signal = yes, you may collect content for the corresponding
#     use.
# (b) If a Content-Signal = no, you may not collect content for the
#     corresponding use.
# (c) If the website operator does not include a Content-Signal for a
#     corresponding use, the website operator neither grants nor restricts
#     permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search: building a search index and providing search results (e.g., returning
#         hyperlinks and short excerpts from your website's contents). Search does not
#         include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
#           augmented generation, grounding, or other real-time taking of content for
#           generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

# HobbyCardIndex.com, Bot Policy
# Last updated: 2026-04-18 (INFRA-001, SEO agent)
#
# Policy summary:
# - Allow general search crawlers and AI crawlers to index all public content.
# - Explicitly allow the AI bots that send referral traffic and citations
#   (ChatGPT, Perplexity, Claude, Google Gemini, Apple Intelligence, etc.).
# - Keep authenticated app surfaces, admin tooling, and internal utility
#   pages out of the index.
#
# The HCI backend server-side-renders crawlable HTML for /cards/:id,
# /sets/:slug, /players/:slug, /prospects/:slug, /teams/:slug, /years/:slug
# via server/seo/renderer.js, so these SPA routes are safe to allow.

# ---------------------------------------------------------------
# Default policy for all crawlers (generic search + anything else)
# ---------------------------------------------------------------
User-agent: *
Allow: /
Allow: /sitemap.xml
Allow: /sitemap-static.xml
Allow: /sitemap-hubs.xml
Allow: /sitemap-guides.xml
Allow: /sitemap-reports.xml
Allow: /sitemap-answers.xml
Allow: /sitemap-compare.xml
Allow: /hubs/
Allow: /guides/
Allow: /compare/
Allow: /reports/
Allow: /answers/
Allow: /sports/
Allow: /years/
Allow: /teams/
Allow: /prospects/
Allow: /alternatives/
Allow: /about/
Allow: /legal/
Allow: /help-center/
Allow: /ja/help-center/
Allow: /hci-logo.png
Allow: /hci-logo.svg
Allow: /og-image.png
Allow: /llms.txt
Allow: /llms-full.txt
# Authenticated app routes (user data, account surfaces)
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /account-deletion.html
# Internal utility pages (not meant for search)
Disallow: /capture.html
Disallow: /screenshot-upload.html
# Crawl hint, be polite but don't starve us.
Crawl-delay: 1

# ---------------------------------------------------------------
# AI crawlers, explicitly ALLOWED
# HCI wants to be cited by ChatGPT, Claude, Perplexity, Gemini, Apple
# Intelligence, and Common Crawl. The rules below mirror the default
# policy for each named agent so the crawl is identical to a search bot.
# ---------------------------------------------------------------

# OpenAI, training crawler
User-agent: GPTBot
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# OpenAI, user-initiated ChatGPT browsing
User-agent: ChatGPT-User
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# OpenAI, SearchGPT / OpenAI search index
User-agent: OAI-SearchBot
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Anthropic, Claude training and research crawler
User-agent: anthropic-ai
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Anthropic, production Claude crawler
User-agent: ClaudeBot
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Anthropic, Claude browsing tool (user-initiated)
User-agent: Claude-Web
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Google, Gemini / Vertex AI training crawler
User-agent: Google-Extended
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Perplexity, answer engine crawler
User-agent: PerplexityBot
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Perplexity, user-initiated browsing
User-agent: Perplexity-User
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Common Crawl, public web corpus used by many LLMs
User-agent: CCBot
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Apple, Apple Intelligence training crawler
User-agent: Applebot-Extended
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# Meta, AI training crawler
User-agent: Meta-ExternalAgent
Allow: /
Disallow: /api/
Disallow: /admin
Disallow: /admin/
Disallow: /admin-waitlist.html
Disallow: /collection
Disallow: /collection/
Disallow: /dashboard
Disallow: /dashboard/
Disallow: /settings
Disallow: /settings/
Disallow: /watchlist
Disallow: /watchlist/
Disallow: /account
Disallow: /account/
Disallow: /capture.html
Disallow: /screenshot-upload.html

# ---------------------------------------------------------------
# Crawlers we still BLOCK
# These are aggressive scrapers or low-signal crawlers that add
# server load without returning referral traffic or citations.
# ---------------------------------------------------------------

# ByteDance / TikTok scraper, aggressive, no referral value
User-agent: Bytespider
Disallow: /

# Huawei Petal search, aggressive, minimal traffic
User-agent: PetalBot
Disallow: /

# Amazon, no referral value, aggressive scraping
User-agent: Amazonbot
Disallow: /

# ---------------------------------------------------------------
# Sitemaps
# ---------------------------------------------------------------
Sitemap: https://hobbycardindex.com/sitemap.xml
Sitemap: https://hobbycardindex.com/sitemap-static.xml
Sitemap: https://hobbycardindex.com/sitemap-hubs.xml
Sitemap: https://hobbycardindex.com/sitemap-guides.xml
Sitemap: https://hobbycardindex.com/sitemap-reports.xml
Sitemap: https://hobbycardindex.com/sitemap-answers.xml
Sitemap: https://hobbycardindex.com/sitemap-about.xml
Sitemap: https://hobbycardindex.com/sitemap-compare.xml