About TakdeBot
TakdeBot is Takde's web crawler that indexes websites to make them discoverable by AI agents.
User-Agent
TakdeBot/1.0 (+https://takde.org/bot)
What TakdeBot Does
- Fetches publicly accessible web pages
- Extracts structured information: name, description, contact details, Schema.org markup
- Generates an agent-readable profile for each website
- Stores a cleaned markdown version of page content
- Checks for agent-readiness files: llms.txt, agent.json, context.txt, robots.txt
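The agent-readiness check above amounts to probing a handful of well-known root-level files. A minimal sketch of how those probe URLs could be built (the function name `readiness_urls` is illustrative, not TakdeBot's actual code):

```python
from urllib.parse import urljoin

# The agent-readiness files listed above, probed at the site root.
READINESS_FILES = ["llms.txt", "agent.json", "context.txt", "robots.txt"]

def readiness_urls(base_url: str) -> list[str]:
    """Build the root-level URLs a crawler would probe for each file."""
    return [urljoin(base_url, "/" + name) for name in READINESS_FILES]

# Even for a deep page URL, the probes target the domain root:
print(readiness_urls("https://example.com/some/page"))
```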
What TakdeBot Does NOT Do
- Does not access pages behind login walls or paywalls
- Does not collect personal data beyond what is publicly displayed
- Does not bypass robots.txt restrictions
- Does not submit forms or interact with websites
- Does not store passwords, cookies, or session data
Data Storage
For each indexed website, TakdeBot stores:
- URL and domain
- Page title and meta description
- Structured data from Schema.org JSON-LD (if present)
- Contact information found on the page (phone, email, address)
- A cleaned markdown version of the page content
- Classification (business, blog, government, etc.)
All stored data comes from publicly accessible pages only.
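To make the field list above concrete, here is a hypothetical profile record covering exactly those fields. The shape and field names are illustrative assumptions, not TakdeBot's actual storage schema:

```python
import json

# Illustrative record shape; every key maps to one bullet in the
# Data Storage list above. Values are invented example data.
record = {
    "url": "https://example.com/",
    "domain": "example.com",
    "title": "Example Co",
    "meta_description": "We make examples.",
    "schema_org": {"@type": "Organization", "name": "Example Co"},  # JSON-LD, if present
    "contact": {"phone": "+1-555-0100", "email": "[email protected]", "address": None},
    "content_markdown": "# Example Co\n\nWe make examples.",
    "classification": "business",
}
print(json.dumps(record, indent=2))
```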
Crawl Behavior
- Respects robots.txt directives
- Respects noindex and noarchive meta tags
- Crawl rate: maximum 1 request per second per domain
- Primary data source: Common Crawl (pre-crawled data), not direct crawling
- Direct crawling only occurs when users scan a URL via the Scanner tool
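The politeness rules above (honor robots.txt, at most 1 request per second per domain) can be sketched with Python's standard-library robots.txt parser. The `PoliteFetcher` class is a hypothetical illustration, not TakdeBot's implementation:

```python
import time
from urllib.robotparser import RobotFileParser

class PoliteFetcher:
    """Sketch of per-domain politeness: robots.txt plus a 1 req/s cap."""

    def __init__(self, user_agent: str = "TakdeBot/1.0"):
        self.user_agent = user_agent
        self.last_request = {}  # domain -> monotonic timestamp of last fetch

    def allowed(self, robots_txt: str, url: str) -> bool:
        """Check a URL against the site's robots.txt rules."""
        rp = RobotFileParser()
        rp.parse(robots_txt.splitlines())
        return rp.can_fetch(self.user_agent, url)

    def wait_if_needed(self, domain: str) -> None:
        """Sleep so that requests to one domain are >= 1 second apart."""
        elapsed = time.monotonic() - self.last_request.get(domain, 0.0)
        if elapsed < 1.0:
            time.sleep(1.0 - elapsed)
        self.last_request[domain] = time.monotonic()
```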
How to Block TakdeBot
Add to your robots.txt:
User-agent: TakdeBot
Disallow: /
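You can verify that rules like the ones above block TakdeBot (and only TakdeBot) before deploying them, for example with Python's standard-library robots.txt parser:

```python
from urllib.robotparser import RobotFileParser

rules = "User-agent: TakdeBot\nDisallow: /\n"
rp = RobotFileParser()
rp.parse(rules.splitlines())

# TakdeBot is blocked everywhere; crawlers with other user agents
# have no matching rule and remain allowed.
print(rp.can_fetch("TakdeBot", "https://example.com/any/page"))   # False
print(rp.can_fetch("OtherBot", "https://example.com/any/page"))   # True
```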
To request removal of an already indexed profile, visit takde.org/remove.
Contact
Questions about TakdeBot: [email protected]
Removal requests: [email protected]
Data protection: [email protected]