Crawlability & Access · Layer 1: Access · Emerging / Experimental

llms.txt Presence

llms.txt is a plain-text file published at the root of your domain (/llms.txt) that gives AI systems a curated, machine-readable map of your most important pages and the context around them. It is a proposed standard, not yet a ratified one, for helping large language models find, understand, and prioritize your authoritative content during retrieval.

Why it matters

Why this signal affects whether AI cites you.

When an AI system answers a question, it first has to find and retrieve the right pages. Most sites give it no help, leaving models to guess from sprawling navigation, JavaScript-rendered menus, and inconsistent internal links. llms.txt addresses that retrieval problem directly: it is a single, curated, Markdown-formatted file that tells AI systems which pages on your domain are authoritative and what each one covers. Think of it as a sitemap written for language models rather than search crawlers: concise, human-readable, and prioritized.

The signal matters most for documentation-heavy sites, product catalogs, and knowledge bases where the highest-value content is buried several clicks deep. Because the format is still emerging and not yet consumed by every AI platform, llms.txt is best understood as a low-cost, forward-looking bet: it costs almost nothing to publish, it is not known to harm traditional SEO, and it positions your most citable content for the retrieval-augmented systems that increasingly mediate how people find answers.

On platforms that read it, a clean, well-scoped llms.txt makes it easier for an AI to locate the exact page that answers a query, and the page an AI can find is the page an AI can cite. The cost of being absent may be asymmetric: a competitor whose llms.txt surfaces the same answer first can become the cited source, and citations can compound over time into perceived authority that is hard to win back.

What good looks like

How to get this right.

  • A root /llms.txt file listing your top product, pricing, and documentation URLs with a one-line description for each entry.
  • An expanded /llms-full.txt that inlines the full text of key pages so models can retrieve them without crawling every link.
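  • Listing llms.txt in your XML sitemap and confirming that robots.txt does not block it, so AI crawlers can discover it on an early visit. robots.txt has no standard directive for pointing at llms.txt, so any mention there is only an informal comment.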
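
The checklist above can be sketched as a small generator. This is a minimal illustration, assuming the layout described in the llms.txt proposal (an H1 title, a blockquote summary, then H2 sections of Markdown links with one-line descriptions). The site name, URLs, and the `Entry` and `render_llms_txt` names are hypothetical, not part of any official tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    title: str
    url: str
    description: str  # one-line summary of what the page covers


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Entry]]) -> str:
    """Render a curated llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, entries in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for e in entries:
            lines.append(f"- [{e.title}]({e.url}): {e.description}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and URLs, for illustration only.
EXAMPLE = render_llms_txt(
    "Example Co",
    "Example Co makes invoicing software for small teams.",
    {
        "Product": [
            Entry("Overview", "https://example.com/product", "What the product does and who it is for."),
            Entry("Pricing", "https://example.com/pricing", "Plans, limits, and billing terms."),
        ],
        "Docs": [
            Entry("Quickstart", "https://example.com/docs/quickstart", "Install and send a first invoice."),
        ],
    },
)
```

Writing `EXAMPLE` to a file served at /llms.txt would publish it. Keeping the entries in a short, hand-maintained structure like this is one way to keep the list curated instead of dumping every URL on the site.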

What to avoid

Common mistakes.

  • Publishing a bloated llms.txt that lists every URL on the site — defeating the point of curation and prioritization.
  • Letting it go stale so it points at pages that have moved, been renamed, or no longer exist.
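
Both mistakes can be caught with a periodic check. The sketch below is one possible approach, not an established tool: it pulls the Markdown links out of an llms.txt body, flags the file as bloated past an arbitrary entry limit, and reports links that fail a liveness check. The `audit_llms_txt` name, the 50-entry default, and the `is_live` callback are assumptions; in practice `is_live` might wrap an HTTP request, but here it is a simple set lookup so the example runs offline.

```python
import re
from typing import Callable

# Matches Markdown list links such as "- [Title](https://example.com/page)".
LINK_RE = re.compile(r"^[ \t]*[-*][ \t]+\[([^\]]+)\]\(([^)\s]+)\)", re.MULTILINE)


def audit_llms_txt(text: str, is_live: Callable[[str], bool], max_entries: int = 50) -> dict:
    """Return the entry count, the URLs failing the liveness check, and a bloat flag."""
    urls = [m.group(2) for m in LINK_RE.finditer(text)]
    stale = [u for u in urls if not is_live(u)]
    return {"count": len(urls), "stale": stale, "bloated": len(urls) > max_entries}


# Hypothetical file contents and a stand-in for the set of URLs that currently resolve.
SAMPLE = """# Example Co

> Invoicing software for small teams.

## Docs

- [Quickstart](https://example.com/docs/quickstart): Install and send a first invoice.
- [Old API guide](https://example.com/docs/v1-api): Renamed last year.
"""
LIVE = {"https://example.com/docs/quickstart"}

report = audit_llms_txt(SAMPLE, is_live=lambda url: url in LIVE)
```

Here `report["stale"]` holds the one URL missing from `LIVE`. Running a check like this whenever pages are moved or renamed is a cheap way to keep the file from drifting out of date.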

You now know the signal. See your score.

This page covers what llms.txt Presence is and how to get it right. AIVZ measures it on your actual pages — across six AI platforms, weighted and prioritized against all 93 factors — and hands you the exact fixes in priority order.

Common questions

Frequently asked.

Do AI systems actually read llms.txt yet?

Not consistently. Adoption is still emerging and not universal across platforms, which is why this factor carries an 'Emerging' confidence label. Treat it as a low-cost, forward-looking signal: cheap to publish, not known to harm SEO, and well-positioned for retrieval-augmented systems if adoption grows.

Is llms.txt a replacement for an XML sitemap?

No. A sitemap helps search crawlers enumerate every URL; llms.txt curates and prioritizes your most authoritative pages for language models in human-readable Markdown. They are complementary, not substitutes.
