Made For Builders


What is llms.txt

llms.txt is a plain-text file placed at a website's root that provides large language models with a structured, markdown-formatted index of a site's most important content. Proposed by Jeremy Howard in September 2024 as an emerging standard, it had been adopted by 844,000 sites by November 2025 according to BuiltWith — yet a 10-week experiment by Search Engine Land found no detectable crawling of the file by any of the four major AI engines.

edu-lopez-parada

Full definition

llms.txt is a plain-text file, written in markdown, that a website owner places at the root of their domain (e.g. https://example.com/llms.txt). Its purpose is to give large language models (LLMs) a curated, human-authored index of the site's most authoritative content. Howard's proposal frames it mainly as an aid at inference time, when a model or agent needs to find and read the right pages, rather than as a way to steer web crawling.

The format was proposed by Jeremy Howard, co-founder of fast.ai, in September 2024 as an informal standard modelled loosely on the simplicity of robots.txt. A conformant llms.txt file begins with an H1 heading naming the site, followed by a short summary, which the proposal formats as a blockquote. Optional H2 sections then list thematic groups of resources as markdown links, each ideally pointing to a publicly accessible markdown version of the relevant page.
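The structure described above can be checked mechanically. The following is a minimal sketch, not a reference parser: the sample file, the site name, and the URLs are invented for illustration, and `parse_llms_txt` is a hypothetical helper. It only verifies that the file opens with an H1 and collects the links under each H2; the summary is written as a blockquote, as in Howard's proposal.

```python
import re

# Hypothetical sample file; the site name, summary, and URLs are made up.
SAMPLE = """\
# Example Site

> Guides and reference material for the Example product.

## Docs

- [Getting started](https://example.com/docs/start.md): Installation and first steps
- [API reference](https://example.com/docs/api.md)

## Company

- [About](https://example.com/about/index.md)
"""

# A list item that starts with a markdown link: "- [label](url)".
LINK_RE = re.compile(r"^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)")


def parse_llms_txt(text):
    """Return (title, sections); sections maps each H2 name to (label, url) pairs.

    Raises ValueError if the first non-blank line is not an H1.
    """
    lines = [ln.rstrip() for ln in text.splitlines()]
    non_blank = [ln for ln in lines if ln.strip()]
    if not non_blank or not non_blank[0].startswith("# "):
        raise ValueError("file must open with an H1 heading")
    title = non_blank[0][2:].strip()

    sections = {}
    current = None
    for ln in lines:
        if ln.startswith("## "):
            current = ln[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(ln)
            if match:
                sections[current].append((match.group(1), match.group(2)))
    return title, sections


title, sections = parse_llms_txt(SAMPLE)
print(title)
for name, links in sections.items():
    print(name, len(links))
```

A stricter checker could also confirm that every linked URL actually resolves to a markdown document, which this sketch deliberately leaves out.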

Unlike robots.txt, llms.txt defines no access-control directives: it cannot instruct a bot to stay away from any URL, and no browser, crawler, or search engine is obliged to read it or act on it. It is, at its core, a voluntary editorial signal.
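To make the contrast concrete: robots.txt rules are machine-readable directives that a compliant client can evaluate before fetching a URL, whereas llms.txt has no equivalent rule syntax. The sketch below uses Python's standard `urllib.robotparser`; the rules and the `ExampleBot` user agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks these questions before requesting a page.
allowed_public = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
allowed_private = parser.can_fetch("ExampleBot", "https://example.com/private/report")
print(allowed_public, allowed_private)
```

Nothing comparable can be written for llms.txt, because the file is a list of links and descriptions rather than a set of allow or disallow rules.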

Why it matters in 2026

The rapid adoption figure — 844,000 sites tracked by BuiltWith as of November 2025 — reflects genuine enthusiasm from the developer and SEO communities. However, adoption should be weighed carefully against empirical evidence of impact.

Search Engine Land published findings from a 10-week experiment (November 2025) that served llms.txt on 50 live sites. None of the four major AI engines — ChatGPT, Perplexity, Claude, and Gemini — made any detectable HTTP requests to the file during the observation period.
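The source does not describe how requests were detected. One plausible way to run a similar check on your own site is to scan server access logs for requests to /llms.txt from known AI user agents. The sketch below assumes combined-log-format lines; the user-agent substrings are an illustrative, incomplete list that should be verified against each vendor's documentation, and the sample log lines are invented.

```python
import re
from collections import Counter

# Assumed user-agent substrings; verify against each vendor's documentation.
AI_AGENT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Captures the request path and the last quoted field (the user agent)
# of a combined-log-format line.
LOG_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" .* "(?P<agent>[^"]*)"$')


def count_llms_txt_hits(log_lines):
    """Count requests for /llms.txt per AI agent token."""
    hits = Counter()
    for line in log_lines:
        match = LOG_RE.search(line.strip())
        if not match or match.group("path").split("?")[0] != "/llms.txt":
            continue
        for token in AI_AGENT_TOKENS:
            if token in match.group("agent"):
                hits[token] += 1
    return hits


# Invented sample lines, not data from any real experiment.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2025:10:00:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.9 - - [01/Jan/2025:10:01:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.7 - - [01/Jan/2025:10:02:00 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = count_llms_txt_hits(SAMPLE_LOG)
print(dict(hits))
```

An empty result over a long observation window would be consistent with the finding reported above, although user-agent strings can be spoofed or changed, so log matching alone is not conclusive.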

An Ahrefs study by Linehan and Guan (75,000 brands, 12 December 2025) found that the presence of llms.txt correlated with AI visibility at a coefficient of 0.127 — the weakest individual signal in the top eight factors analysed. For context, the presence of a YouTube channel correlated at 0.737 in the same study.

The picture that emerges is one of a promising convention whose practical effect on AI citation remains unproven at scale. Implementing llms.txt is low-effort and carries no downside, but it should not be prioritised over higher-impact signals such as structured data, external citations, and authoritative content.

How it works

  1. Create a file named llms.txt and place it at the root of the domain.
  2. Open with an H1 containing the site or brand name.
  3. Add a concise paragraph (two to four sentences) describing what the site covers and for whom.
  4. Group related content into named sections using H2 headings. Under each heading, list markdown links to the most relevant pages.
  5. Ensure each linked resource is served as a publicly accessible markdown file (e.g. /about/index.md).
  6. Keep the file updated as major content changes occur.
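The steps above can be sketched as a small generator. This is one possible approach, not an official tool: `build_llms_txt` is a hypothetical helper, and the site name, summary, and URLs are placeholders. The summary is emitted as a blockquote, following Howard's proposal, although the steps above only ask for a short paragraph.

```python
def build_llms_txt(site_name, summary, sections):
    """Render the text of an llms.txt file.

    sections maps each section title to a list of (label, url) pairs.
    """
    parts = [f"# {site_name}", "", f"> {summary.strip()}", ""]
    for heading, links in sections.items():
        parts.append(f"## {heading}")
        parts.append("")
        for label, url in links:
            parts.append(f"- [{label}]({url})")
        parts.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(parts).rstrip() + "\n"


# Placeholder content; replace with the site's real pages.
text = build_llms_txt(
    "Example Site",
    "Guides and reference material for the Example product.",
    {
        "Docs": [("Getting started", "https://example.com/docs/start.md")],
        "Company": [("About", "https://example.com/about/index.md")],
    },
)
print(text)
```

The resulting text would be saved as llms.txt in the web root so that it is served at /llms.txt, and regenerated whenever the listed pages change.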

Difference from robots.txt and sitemap.xml

| File | Primary function | Read by | Status in 2026 |
| --- | --- | --- | --- |
| robots.txt | Access control for web crawlers | Bots that honour the Robots Exclusion Protocol | Mature standard (RFC 9309); compliance is voluntary but widely honoured |
| sitemap.xml | URL index for search engine indexing | Search engine crawlers | Mature, widely honoured standard |
| llms.txt | Curated content index for LLMs | LLMs (in principle) | Emerging convention; active crawling unconfirmed |

Related terms

Answer Engine Optimisation (AEO), Generative Engine Optimisation (GEO), AI Overviews.
