LLMs.txt Optimization for AI Discovery
A hand-authored llms.txt at the root of your domain that gives AI crawlers and AI agents a curated map of what your business actually does, and a robots.txt that explicitly welcomes the crawlers that read it.
Overview
Search engines have had robots.txt for thirty years and sitemap.xml for nearly twenty. AI engines, until recently, had to figure out a site by crawling it. llms.txt, proposed by Jeremy Howard in late 2024 and adopted by a growing number of major documentation, tooling, and content sites, is the AI-era equivalent: a curated, markdown-formatted file at the root of a domain that says "here's what this site is, here are the pages that matter, in this order, with this context."
Adoption isn't universal yet. Some AI crawlers explicitly look for llms.txt; others don't. But two things are true regardless: the cost of shipping one is essentially zero, and the file is read by the AI agents that users point at your site (Claude with browsing, custom RAG setups, agent-style research tools) even when no major AI crawler reads it directly.
Our llms.txt service hand-authors the file, ships an AI-aware robots.txt alongside it, and (for content-heavy sites) optionally delivers a longer llms-full.txt with the actual content of priority pages in markdown.
What is llms.txt?
llms.txt is a proposed standard for a markdown file served at the root of a domain (e.g., https://example.com/llms.txt). Its structure is defined: an H1 with the site or project name, a blockquote with a one-sentence summary, optional context paragraphs, then H2 sections containing curated links to the most important pages on the site, each with a short description.
The format is designed to be parseable by humans and large language models alike, prioritizes signal over completeness, and is meant to be hand-curated rather than auto-generated. Think of it as a hand-written README for your entire site, written for an AI to read directly.
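The structure described above can be checked mechanically. The following is a minimal Python sketch, not an official validator: the sample file, the company name, the example.com URLs, and the `parse_llms_txt` helper are all hypothetical, and the link pattern assumes each entry is written as `- [Title](URL): description`.

```python
import re

# Hypothetical sample file; names and URLs are placeholders.
SAMPLE = """\
# Example Co

> Example Co provides specialty B2B consulting services.

Optional context paragraph about the site.

## Core Services

- [Consulting](https://example.com/consulting): Advisory engagements for mid-size firms.
- [Audits](https://example.com/audits): Fixed-scope technical audits.

## About

- [Team](https://example.com/team): Who we are.
"""

# Assumed link-line shape: "- [Title](URL)" with an optional ": description".
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.+))?$"
)


def parse_llms_txt(text):
    """Return (title, summary, sections); raise ValueError if no H1 comes first."""
    nonblank = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    if not nonblank or not nonblank[0].startswith("# "):
        raise ValueError("first non-blank line must be an H1 ('# Name')")
    title = nonblank[0][2:].strip()
    summary = None      # first blockquote line before any H2, if present
    sections = {}       # H2 heading -> list of (title, url, description)
    current = None
    for ln in nonblank[1:]:
        if ln.startswith("## "):
            current = ln[3:].strip()
            sections[current] = []
        elif ln.startswith("> ") and current is None and summary is None:
            summary = ln[2:].strip()
        elif current is not None:
            match = LINK_RE.match(ln)
            if match:
                sections[current].append(
                    (match["title"], match["url"], match["desc"])
                )
    return title, summary, sections


title, summary, sections = parse_llms_txt(SAMPLE)
for heading, links in sections.items():
    print(heading, len(links))
```

Context paragraphs between the summary and the first H2 are deliberately ignored here; a stricter check could also flag sections that contain no links.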
How we work
- Site taxonomy review: Map the site's actual content surface: services, sub-services, hub topics, key content pieces. The llms.txt structure follows this taxonomy, so getting it right matters more than the writing.
- Curated section design: Decide which sections appear in the llms.txt and in what order. Most sites have 4 to 8 sections (Core Services, Sub-services per area, Resources, About). Each section gets curated links with one-line descriptions.
- Authoring and review: The file is written by hand, reviewed against the official spec, and validated as parseable markdown. Average length: 60 to 120 lines. Anything longer is usually a sign that the extra material belongs in llms-full.txt instead.
- AI-aware robots.txt: An updated robots.txt explicitly allows GPTBot, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, CCBot, Applebot-Extended, and other AI crawlers. Sites that block AI crawlers by accident are not unusual; we fix that.
- Versioning and maintenance: llms.txt is committed alongside your site code, carries a dated version in a comment header, and is updated whenever services change. Quarterly review is included in any maintenance retainer.
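The robots.txt step can be sanity-checked with Python's standard-library `urllib.robotparser`. This is a rough sketch under stated assumptions: the `blocked_crawlers` and `allow_block` helpers and the example.com URL are hypothetical, and the check only reports what the file says, not whether a given crawler actually honors it.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the list above.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended",
    "PerplexityBot", "CCBot", "Applebot-Extended",
]


def blocked_crawlers(robots_txt, url="https://example.com/", agents=AI_CRAWLERS):
    """Return the agents that this robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


def allow_block(agents=AI_CRAWLERS):
    """Build one explicit 'Allow: /' group per crawler (illustrative layout)."""
    return "\n\n".join(f"User-agent: {agent}\nAllow: /" for agent in agents) + "\n"


# A robots.txt that blocks two AI crawlers, as in an accidental misconfiguration.
broken = (
    "User-agent: GPTBot\nDisallow: /\n\n"
    "User-agent: ClaudeBot\nDisallow: /\n\n"
    "User-agent: *\nAllow: /\n"
)
print(blocked_crawlers(broken))
print(blocked_crawlers(allow_block()))
```

Running the check against the live file before and after a change makes accidental blocks visible without waiting for crawler logs.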
What this service includes
- Site taxonomy review and section design
- Hand-authored llms.txt at /llms.txt
- Optional llms-full.txt for content-heavy sites
- AI-aware robots.txt with explicit crawler allows
- Versioned, comment-headered, parseable file
- Markdown validation against the llms.txt spec
- Linked-page reachability and 200-status check
- Quarterly review included in maintenance retainers
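The linked-page reachability check in the list above might look roughly like the sketch below. It assumes links are written as markdown `[Title](URL)` pairs and that the target servers answer HEAD requests; `http_status` and `unreachable_links` are hypothetical helper names. Because `urllib` follows redirects, a link that redirects to a working page is reported as 200.

```python
import re
import urllib.error
import urllib.request

# Matches absolute http(s) URLs inside markdown links.
MD_LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def http_status(url, timeout=10.0):
    """Return the HTTP status of a HEAD request, or None on network failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code          # 4xx/5xx responses arrive as exceptions
    except OSError:
        return None              # DNS failure, refused connection, timeout


def unreachable_links(llms_txt, fetch=http_status):
    """Map each linked URL that does not return 200 to the status it returned."""
    problems = {}
    for url in dict.fromkeys(MD_LINK.findall(llms_txt)):  # dedupe, keep order
        status = fetch(url)
        if status != 200:
            problems[url] = status
    return problems
```

The `fetch` parameter exists so the check can be exercised with a stub instead of live requests, for example `unreachable_links(text, fetch=lambda url: 200)`.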
llms.txt vs. sitemap.xml vs. robots.txt
| | robots.txt | sitemap.xml | llms.txt |
|---|---|---|---|
| Audience | All crawlers | Search-engine crawlers | AI engines and agents |
| Format | Plain text rules | XML | Markdown |
| Purpose | Access rules | URL inventory | Curated overview |
| Curation | None | Comprehensive (every URL) | Highly curated (priority URLs) |
| Adoption (2025) | Universal | Universal | Growing |
Engagement example
A specialty B2B services firm had no llms.txt and a robots.txt that (by oversight) blocked GPTBot and ClaudeBot, which meant those crawlers could not access the site directly. We hand-authored a 95-line llms.txt covering their service taxonomy, fixed the robots.txt to explicitly allow major AI crawlers, and added a quarterly-review note to their maintenance retainer.
Representative engagement. Client identity withheld for privacy.
No llms.txt yet? Want to ship one this month?
Send your URL. We'll hand-author an llms.txt against your service taxonomy and ship it alongside an updated robots.txt, typically inside two weeks.