AI Search Optimization

LLMs.txt Optimization for AI Discovery

A hand-authored llms.txt at the root of your domain that gives AI crawlers and AI agents a curated map of what your business actually does, plus a robots.txt that explicitly welcomes the crawlers that read it.

Request a Consultation
Call (407) 409-8383

Overview

Search engines have had robots.txt for thirty years and sitemap.xml for nearly twenty. AI engines, until recently, had to figure out a site by crawling it. llms.txt, proposed by Jeremy Howard in late 2024 and adopted by a growing number of major documentation, tooling, and content sites, is the AI-era equivalent: a curated, markdown-formatted file at the root of a domain that says "here's what this site is, here are the pages that matter, in this order, with this context."

Adoption isn't universal yet. Some AI crawlers explicitly look for llms.txt; others don't. But two things are true regardless: the cost of shipping one is essentially zero, and the file is read by the AI agents that users point at your site (Claude with browsing, custom RAG setups, agent-style research tools) even when no major AI crawler reads it directly.

Our llms.txt service hand-authors the file, ships an AI-aware robots.txt alongside it, and (for content-heavy sites) optionally delivers a longer llms-full.txt with the actual content of priority pages in markdown.

What is llms.txt?

llms.txt is a proposed standard for a markdown file served at the root of a domain (e.g., https://example.com/llms.txt). Its structure is defined: an H1 with the site or project name, a blockquote with a one-sentence summary, optional context paragraphs, then H2 sections containing curated links to the most important pages on the site, each with a short description.

The format is designed to be parseable by humans and large language models alike, prioritizes signal over completeness, and is meant to be hand-curated rather than auto-generated. Think of it as a hand-written README for your entire site, written for an AI to read directly.
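The structure described above can be checked mechanically. The following is a minimal sketch, not an official validator: the sample file, the `example.com` URLs, and the `check_llms_txt` helper are all hypothetical, and the rules it enforces are only the basic shape named in this section (one H1, a blockquote summary, H2 sections with link items).

```python
import re

# Hypothetical sample file; example.com and the page names are placeholders.
SAMPLE_LLMS_TXT = """\
# Example Co

> Example Co provides widget repair services for small businesses.

Optional context paragraph about the site.

## Core Services

- [Widget Repair](https://example.com/services/widget-repair): On-site and mail-in repair.
- [Maintenance Plans](https://example.com/services/maintenance): Quarterly service plans.

## About

- [Company](https://example.com/about): Who we are and where we operate.
"""

# A link item: "- [Title](https://url)" with an optional ": description".
LINK_ITEM = re.compile(r"^- \[([^\]]+)\]\((https?://[^)\s]+)\)(?::\s*(.*))?$")


def check_llms_txt(text: str) -> list[str]:
    """Return structural problems; an empty list means the basic shape is present."""
    problems = []
    lines = [ln.rstrip() for ln in text.splitlines()]
    non_empty = [ln for ln in lines if ln]

    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line is not an H1")
    if sum(1 for ln in lines if ln.startswith("# ")) != 1:
        problems.append("expected exactly one H1")
    if not any(ln.startswith("> ") for ln in lines):
        problems.append("missing blockquote summary")

    # Count well-formed link items under each H2 section.
    section = None
    links_per_section = {}
    for ln in lines:
        if ln.startswith("## "):
            section = ln[3:].strip()
            links_per_section[section] = 0
        elif section is not None and ln.startswith("- "):
            if LINK_ITEM.match(ln):
                links_per_section[section] += 1
            else:
                problems.append(f"malformed link item in '{section}': {ln}")

    if not links_per_section:
        problems.append("no H2 sections found")
    for name, count in links_per_section.items():
        if count == 0:
            problems.append(f"section '{name}' has no links")
    return problems


if __name__ == "__main__":
    print(check_llms_txt(SAMPLE_LLMS_TXT) or "basic structure looks fine")
```

A check like this only confirms the file's shape. Whether the right pages are listed, in the right order, remains an editorial judgment.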

How we work

Site taxonomy review: Map the site's actual content surface: services, sub-services, hub topics, and key content pieces. The llms.txt structure follows this taxonomy, so getting it right matters more than the writing.
Curated section design: Decide which sections appear in the llms.txt and in what order. Most sites have 4 to 8 sections (Core Services, Sub-services per area, Resources, About). Each section gets curated links with one-line descriptions.
Authoring and review: The file is written by hand, reviewed against the official spec, and validated as parseable markdown. Average length: 60 to 120 lines. Anything longer usually belongs in llms-full.txt instead.
AI-aware robots.txt: An updated robots.txt explicitly allows GPTBot, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, CCBot, Applebot-Extended, and other AI crawlers. Sites that block AI crawlers by accident are not unusual; we fix that.
Versioning and maintenance: llms.txt is committed alongside your site code, carries a dated version in a comment header, and is updated whenever services change. Quarterly review is included in any maintenance retainer.
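The robots.txt step can be sanity-checked with Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: the crawler tokens are the ones named in the step above and should be confirmed against each vendor's current documentation, and `build_robots_txt` and `blocked_crawlers` are hypothetical helper names, not part of any tool we ship.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens named in the AI-aware robots.txt step; verify against vendor docs.
AI_CRAWLERS = [
    "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended",
    "PerplexityBot", "CCBot", "Applebot-Extended",
]


def build_robots_txt(crawlers, sitemap_url=None):
    """Render a robots.txt with one explicit allow group per crawler."""
    lines = []
    for name in crawlers:
        lines += [f"User-agent: {name}", "Allow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    if sitemap_url:
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


def blocked_crawlers(robots_txt, crawlers, url="https://example.com/"):
    """Return the crawlers that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [name for name in crawlers if not parser.can_fetch(name, url)]


if __name__ == "__main__":
    # A robots.txt that blocks two AI crawlers, as in an accidental misconfiguration.
    accidental = (
        "User-agent: GPTBot\nDisallow: /\n\n"
        "User-agent: ClaudeBot\nDisallow: /\n\n"
        "User-agent: *\nAllow: /\n"
    )
    print("blocked before:", blocked_crawlers(accidental, AI_CRAWLERS))
    print("blocked after: ", blocked_crawlers(build_robots_txt(AI_CRAWLERS), AI_CRAWLERS))
```

Running the audit against the existing file first shows which crawlers are blocked by accident before anything is rewritten.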

What this service includes

Site taxonomy review and section design
Hand-authored llms.txt at /llms.txt
Optional llms-full.txt for content-heavy sites
AI-aware robots.txt with explicit crawler allows
Versioned, comment-headered, parseable file
Markdown validation against the llms.txt spec
Linked-page reachability and 200-status check
Quarterly review included in maintenance retainers
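The reachability and 200-status check in the list above might look something like the sketch below. It is illustrative only: `extract_links`, `http_status`, and `broken_links` are hypothetical names, the check assumes servers answer HEAD requests (some do not, and a GET fallback may be needed), and because `urlopen` follows redirects, a redirected page reports the status of its final destination.

```python
import re
import urllib.error
import urllib.request

# Matches markdown links of the form [Title](https://url).
MD_LINK = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def extract_links(llms_txt: str) -> list[str]:
    """Return the unique http(s) link targets in the file, in document order."""
    seen = []
    for url in MD_LINK.findall(llms_txt):
        if url not in seen:
            seen.append(url)
    return seen


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url; 0 means the request failed outright."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "llms-txt-link-check"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses arrive as exceptions
    except OSError:
        return 0  # DNS failure, refused connection, timeout, and similar


def broken_links(llms_txt: str, fetch_status=http_status) -> dict[str, int]:
    """Map each linked URL that did not return 200 to the status it returned."""
    results = {url: fetch_status(url) for url in extract_links(llms_txt)}
    return {url: status for url, status in results.items() if status != 200}
```

The `fetch_status` parameter is injectable so the check can be exercised without network access, for example by passing a dictionary's `get` method that maps URLs to canned status codes.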

llms.txt vs. sitemap.xml vs. robots.txt

Three discovery files, three different jobs.
| | robots.txt | sitemap.xml | llms.txt |
| --- | --- | --- | --- |
| Audience | All crawlers | Search-engine crawlers | AI engines and agents |
| Format | Plain text rules | XML | Markdown |
| Purpose | Access rules | URL inventory | Curated overview |
| Curation | None | Comprehensive (every URL) | Highly curated (priority URLs) |
| Adoption (2025) | Universal | Universal | Growing |

Engagement example

A specialty B2B services firm had no llms.txt and a robots.txt that (by oversight) blocked GPTBot and ClaudeBot, which meant those two crawlers could not access the site directly. We hand-authored a 95-line llms.txt covering their service taxonomy, fixed the robots.txt to explicitly allow major AI crawlers, and added a quarterly-review note to their maintenance retainer.

1 hand-authored llms.txt at the root

~15 AI crawlers now explicitly allowed

2 crawlers unblocked (previously blocked by accident)

Representative engagement. Client identity withheld for privacy.

Frequently asked questions

Q: What is llms.txt?

llms.txt is a proposed standard for a markdown-formatted file at the root of a domain (/llms.txt) that gives AI systems a curated, structured overview of the site's most important content. It is analogous to robots.txt or sitemap.xml, but designed to be read by large language models. It was proposed by Jeremy Howard in late 2024 and has been adopted by a growing number of major sites since.

Q: Do AI engines actually read llms.txt today?

It's evolving. Some AI crawlers and tools explicitly look for llms.txt; others don't yet. The strategic argument for shipping one anyway is that the file costs almost nothing to maintain, may be read directly by AI tools, and does get read by the AI agents that users point at your site for due-diligence research. It's a low-cost asymmetric bet.

Q: What's the difference between llms.txt and llms-full.txt?

The standard defines two related files. llms.txt is the short, navigable index: an H1 site name, a blockquote tagline, and curated link sections pointing to the most important pages. llms-full.txt is the long-form version that may include the actual content of those pages in markdown form. We typically deliver llms.txt and recommend llms-full.txt only for content-heavy sites where it adds real value.

Q: How is llms.txt different from sitemap.xml or robots.txt?

sitemap.xml is a comprehensive list of every URL the site wants crawled. robots.txt is a set of access rules. llms.txt is a curated, hierarchically organized markdown overview of the site's most important content, with descriptions: essentially "here's what this site is about and where the canonical pages live, in a format an AI can read directly."

Q: Should the llms.txt be auto-generated or hand-authored?

Hand-authored, almost always. The whole point of llms.txt is curation: which pages matter, in what order, with what one-line descriptions. An auto-generated version that lists every URL defeats the purpose. We hand-author the file from your service taxonomy, version it, and update it as services change.

No llms.txt yet? Want to ship one this month?

Send your URL. We'll hand-author an llms.txt against your service taxonomy and ship it alongside an updated robots.txt, typically inside two weeks.

Start a Project
