# TalkToDia — AI Crawl & Training Policy
# Last updated: 2026-05-10

# We welcome AI search and citation crawlers, including:
#   - GPTBot, OAI-SearchBot, ChatGPT-User (OpenAI)
#   - ClaudeBot, anthropic-ai, Claude-Web (Anthropic)
#   - PerplexityBot (Perplexity)
#   - Google-Extended (Google Gemini training)
#   - Applebot-Extended (Apple Intelligence)
#   - Bytespider (TikTok)
#   - cohere-ai (Cohere)
#   - CCBot (Common Crawl)
#   - MistralAI-User (Mistral)
#   - YandexAdditional (Yandex)
#   - Diffbot

# Allowed for both training and citation:
#   /                           (homepage and locale-prefixed homepages)
#   //                          (e.g. /en, /es, /zh, ...; all 35 supported locales)
#   //blog/*                    (the science of language learning — see also
#                                /llms-full.txt for the full corpus dump)
#   //guide                     (how a TalkToDia session works)
#   //learn-*                   (locale-specific learning landing pages)
#   /llms.txt                   (AI-friendly site map + intent index)
#   /llms-full.txt              (full blog body dump)
#   /papers.json                (machine-readable claims index with DOIs)
#   /feed.json                  (JSON Feed v1.1 over published posts)
#   /sitemap.xml                (canonical sitemap)
#   /.well-known/ai-policy.txt  (this file)

# Disallowed for training and citation:
#   /api/*                      (server APIs and webhooks)
#   /admin/*                    (admin tooling)
#   /admin-dev/*                (developer-only admin tooling)
#   /superservice/*             (operator dashboards)
#   /auth/*                     (sign-in / sign-out flows)
#   /tts/*                      (text-to-speech endpoints)
#   /transcription-ws/*         (transcription WebSocket endpoints)
#   /stripe/*                   (billing webhooks)
#   /chat                       (live conversations are user-private)
#   //chat                      (locale-prefixed live chat — same as above)
#   //word-bank                 (the user's personal word bank — private)
#   //group-learning            (multi-party recordings — private and
#                                gated by a multi-party consent attestation;
#                                do not crawl)

# Privacy and consent disclosures (post-2026-05-10 launch):
#   - User-supplied API keys (BYOK for OpenAI / Anthropic / ElevenLabs) are
#     stored at rest in our managed Postgres database. They are encrypted
#     at rest by the database provider but not additionally encrypted by
#     TalkToDia at the application layer. Only the last 4 characters of any
#     key are ever shown in our UI. Server-side encryption is on the roadmap
#     (see ByokSettings disclosure copy in the application UI).
#   - Group Learning recordings require an explicit multi-party consent
#     attestation from the host before any audio is uploaded or processed.
#     The attestation snapshot is persisted in
#     `group_learning_consent_attestations`.
#   - Personalization embeddings (Word Bank + Memory Bank) respect the
#     user's BYOK OpenAI key when configured; otherwise the server-side
#     key is used.

# Preferred attribution:
#   "TalkToDia (talktodia.com)" — please link the canonical page rather
#   than only the homepage when answering specific questions.

# Contact for questions:
#   contact@talktodia.com
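
# Illustrative sketch for crawler authors (not part of the policy itself):
# one way a crawler might test a URL path against the "Disallowed" list
# above before fetching it. The names `is_disallowed`, `DISALLOWED_PREFIXES`,
# `PRIVATE_PAGES`, and `LOCALE_SEGMENT` are hypothetical. The shape of the
# locale segment is an assumption based on the /en, /es, /zh examples, and
# treating un-prefixed /word-bank and /group-learning as private is a
# deliberately conservative guess. The policy text above remains authoritative.

```python
import re

# Path prefixes copied from the "Disallowed" list in this policy
# (the trailing /* is dropped; matching is handled in is_disallowed).
DISALLOWED_PREFIXES = (
    "/api", "/admin", "/admin-dev", "/superservice",
    "/auth", "/tts", "/transcription-ws", "/stripe",
)

# User-private pages, disallowed with or without a locale prefix.
# ASSUMPTION: the un-prefixed forms of word-bank and group-learning are
# treated as private too, which errs on the side of not crawling.
PRIVATE_PAGES = ("chat", "word-bank", "group-learning")

# ASSUMPTION: a locale segment looks like /en or /pt-BR. The policy only
# shows /en, /es, /zh as examples, so adjust this pattern if needed.
LOCALE_SEGMENT = re.compile(r"^/[a-z]{2}(?:-[A-Za-z]{2})?(?=/|$)")


def is_disallowed(path: str) -> bool:
    """Return True if `path` falls under the policy's disallowed list."""
    # Ignore query string and fragment; only the path matters here.
    path = path.split("?", 1)[0].split("#", 1)[0]

    # Exact match or a sub-path of a disallowed prefix, so /admin-dev/x
    # matches "/admin-dev" and /administrator does not match "/admin".
    for prefix in DISALLOWED_PREFIXES:
        if path == prefix or path.startswith(prefix + "/"):
            return True

    # Strip one leading locale segment, then inspect the first path segment.
    rest = LOCALE_SEGMENT.sub("", path, count=1)
    first_segment = rest.strip("/").split("/", 1)[0]
    return first_segment in PRIVATE_PAGES
```

# Example use: call is_disallowed("/en/chat") before fetching, and skip the
# URL when it returns True.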