Indexly
AI & LLMs · Updated April 27, 2026

llms.txt

Definition

llms.txt is a proposed web standard — a markdown-formatted file placed at the root of a website — that gives LLMs and AI tools a curated index of a site's most important content. Modeled on robots.txt and sitemap.xml but designed for LLM comprehension rather than search crawlers, llms.txt is in the early adoption phase as of 2026, with no major AI platform officially committed to consuming it.

What it is

llms.txt is a single markdown file served at /llms.txt on a website's root domain. Its purpose is to give large language models and AI-powered tools a concise, machine-friendly map of the site's most important pages — the documentation, reference content, and canonical resources that should be cited when AI systems answer questions about the site or its category.

The standard was proposed by Jeremy Howard (founder of fast.ai and Answer.AI) in 2024. It is not maintained by W3C or IETF — it is a community-driven proposal that has gained traction among developer tooling and SaaS companies. By April 2026, llms.txt files have been published by Anthropic, Stripe, Zapier, Cloudflare, and a long list of developer-platform companies.

How to implement it

The format is intentionally simple. A valid llms.txt file starts with an H1 containing the site or product name, followed by a blockquote summary, then markdown sections grouping links to canonical pages.

A minimal example (the H1 carries the site or product name, per the format described above):

```
# Indexly

> AI search visibility and GEO platform that tracks brand mentions
> across ChatGPT, Claude, Perplexity, and Google AI Overviews.

## Documentation

- [Getting started](https://indexly.ai/docs/getting-started.md): How
  to set up your first prompt-tracking project.
- [API reference](https://indexly.ai/docs/api.md): Full reference
  for the Indexly API.

## Glossary

- [Generative engine optimization](https://indexly.ai/glossary/generative-engine-optimization.md)
- [Share of model](https://indexly.ai/glossary/share-of-model.md)
```

Optional pieces:

1. **Companion `.md` versions of pages.** llms.txt entries conventionally link to markdown versions of HTML pages (just append `.md` to the URL). This gives LLMs a clean, fully-rendered version of the content without HTML chrome.

2. **An `llms-full.txt` companion file.** A single file containing the complete markdown content of every linked page concatenated together. Useful for AI tools that prefer one fetch over many.

3. **Sectioning by audience.** "Documentation," "Examples," "API Reference," "Optional" are the conventional section headings — but any markdown heading structure works.
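The `llms-full.txt` companion in item 2 can be generated mechanically: parse the markdown links out of `llms.txt`, fetch each linked `.md` page, and concatenate the results. A minimal sketch is below; `build_llms_full` and its injectable `fetch` parameter are hypothetical names for illustration, not part of the proposal.

```python
import re
import urllib.request


def build_llms_full(llms_txt: str, fetch=None) -> str:
    """Concatenate the markdown content of every .md page linked
    from an llms.txt body into a single llms-full.txt body.

    `fetch` maps a URL to its text content; it is injectable so the
    function can be tested without network access. By default it
    performs a plain HTTP GET.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as resp:
                return resp.read().decode("utf-8")

    # Markdown links of the form [title](https://...something.md)
    urls = re.findall(r"\[[^\]]+\]\((https?://[^)\s]+\.md)\)", llms_txt)

    parts = []
    for url in urls:
        # An HTML comment marks where each source page begins.
        parts.append(f"<!-- source: {url} -->\n{fetch(url).strip()}")
    return "\n\n".join(parts) + "\n"
```

Injecting `fetch` also makes it easy to run the builder against a local content store at deploy time instead of re-crawling your own site.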

Key figures:

- **2024**: year Jeremy Howard proposed the llms.txt standard (llmstxt.org)
- **April 2026**: date by which Anthropic, Stripe, Zapier, and Cloudflare had published llms.txt files (industry analysis, 2026)
- **Zero**: number of major AI platforms officially committed to consuming llms.txt as a primary input (industry analysis, 2026)

Common mistakes

The four most common implementation errors:

1. **Treating it like robots.txt.** llms.txt is additive, not restrictive. It tells LLMs where the canonical content is; it does not prevent crawling. Use robots.txt to control crawler access.

2. **Linking to HTML pages without markdown alternatives.** The whole value of llms.txt is delivering clean machine-readable content. If every link points to an HTML page wrapped in navigation chrome and JavaScript, the file does not solve the problem it was designed for.

3. **Listing every page on the site.** llms.txt is a curated index, not a sitemap. Listing 500 pages dilutes the signal. Most sites need 10–50 carefully chosen entries covering the most cite-worthy content.

4. **Not maintaining it.** A stale llms.txt with broken links, removed pages, and outdated descriptions actively harms credibility. Treat it like documentation, not a one-time asset.

How to validate it

Three checks before publishing:

1. **Syntax validation.** The llms.txt format is just markdown, so any markdown linter catches structural errors. Several community-built validators (search "llms.txt validator") also check for the conventional H1–blockquote–sections structure.

2. **Link integrity.** Every link in the file should resolve. Run a link checker quarterly to catch drift as your site evolves.

3. **Adoption status check.** Be honest with stakeholders about what llms.txt does and does not do today. Major AI platforms have not publicly confirmed they consume llms.txt as a first-class input. Publishing is forward-compatible — it positions you for when consumption matures — but it is not currently a measurable ranking input.
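The structural checks above can be sketched as a small script. This is an illustrative validator for the conventional H1–blockquote–sections shape, not an official one; `validate_llms_txt` is a hypothetical name, and the returned link list is meant to be handed to a separate link checker.

```python
import re


def validate_llms_txt(text: str):
    """Check a candidate llms.txt against the conventional structure:
    an H1 title first, a blockquote summary near the top, and at least
    one H2 section of links. Returns (errors, link_urls)."""
    errors = []
    lines = [line for line in text.splitlines() if line.strip()]

    if not lines or not lines[0].startswith("# "):
        errors.append("first non-blank line is not an H1 title")
    if not any(line.startswith("> ") for line in lines[:5]):
        errors.append("no blockquote summary near the top")
    if not any(line.startswith("## ") for line in lines):
        errors.append("no H2 sections found")

    # Collect markdown link URLs so a link checker can verify that
    # each one resolves (check 2 above).
    links = re.findall(r"\[[^\]]+\]\((https?://[^)\s]+)\)", text)
    return errors, links
```

Running this in CI alongside a quarterly link check covers the first two checks; the third is a conversation, not a script.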

Frequently asked questions

Is llms.txt the same as robots.txt?

No. robots.txt controls whether crawlers can access content. llms.txt advertises which content is most worth using. They serve complementary purposes — robots.txt is restrictive, llms.txt is curatorial. Most sites should have both.

Do AI platforms actually use llms.txt?

Not officially as of April 2026. No major AI platform — OpenAI, Anthropic, Google, Perplexity — has publicly committed to reading llms.txt as a primary input. Community-built tools and retrieval pipelines can be configured to fetch it, but consumption is opt-in and inconsistent. Publishing is forward-compatible rather than currently-rewarded.

Should I publish an llms.txt file anyway?

Yes, if you maintain documentation, reference content, or other cite-worthy resources. The cost is low (minutes to create, hours to maintain) and the upside is meaningful if adoption matures. The same content also benefits human visitors who reach the file directly. Skip it if your site is primarily marketing pages with little reference-grade content.

How is llms.txt different from sitemap.xml?

sitemap.xml is exhaustive and machine-readable in a structured XML format aimed at search crawlers. llms.txt is curated and human-readable in markdown aimed at LLMs. sitemap.xml says "here is every page on the site"; llms.txt says "here are the pages most worth citing."

What goes in llms-full.txt?

llms-full.txt is the optional companion file containing the full markdown content of every page referenced in llms.txt, concatenated into a single file. It exists for AI tools that prefer one fetch over many — at the cost of a much larger file size. Not all sites that publish llms.txt also publish llms-full.txt.