llms.txt is not robots.txt for AI. It's something more interesting.

The file is widely described as 'robots.txt for AI.' That framing is wrong in an instructive way — robots.txt is a restriction, llms.txt is a declaration. Here's what it actually does, whether it works, and what to put in it.

27 June 2026 · 7 min read

The short version: llms.txt is a plain-text file you put at the root of your domain. It tells AI systems what your site is about and which pages deserve citation priority. It is not a crawl restriction — it's the opposite. It's a declaration of intent. Whether AI systems honour it today is an open question; whether you should add one is not.

The "robots.txt for AI" framing keeps appearing in LLM SEO write-ups and I understand why — robots.txt is the closest analogue most people have for a machine-readable signal at the root of a domain. But the analogy inverts the purpose. robots.txt says: don't crawl this. llms.txt says: here is what I am and what I want to be cited for. One is a restriction, the other is a pitch.

That distinction matters if you're thinking about AI visibility seriously, because conflating them leads to wrong conclusions about what the file can and can't do.

What llms.txt actually is

The standard was proposed in September 2024 by Jeremy Howard (fast.ai, answer.ai). The format is deliberately minimal:

  • A plain markdown file at yourdomain.com/llms.txt
  • An H1 with your site name
  • A short description of what the site is and who it's for
  • Bullet lists of key URLs with one-line descriptions, organised by priority

That's it. There's no schema to validate, no sitemap-style XML, no authentication layer. The format is readable by a human in thirty seconds and by an LLM in one pass. Here's a stripped-down example:

# Eitan Gorodetsky

> Founder and operator writing about AI adoption, AI visibility, and building with AI at the product level. Based in Australia, focused on AI-native operations.

## Key essays

- [What AI actually cites](/writing/what-ai-actually-cites): Original audit of 3,050 AI answers and 11,700 citations — which sources AI engines actually pull from and why.
- [AI-native marketing operations](/writing/ai-native-marketing-operation): How to rebuild a marketing function around AI rather than bolting it on.

## About

- [About](/about): Background, focus areas, contact.

The file above takes less time to write than the planning meeting about whether to write it.
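Because the format is this small, reading it back takes very little code. What follows is a minimal sketch, not a reference parser: it assumes the layout described above (one H1, one blockquote summary, H2 sections of link bullets). The names `LlmsTxt`, `parse_llms_txt` and `SAMPLE` are illustrative and not part of the proposal.

```python
import re
from dataclasses import dataclass, field

# Matches bullets of the form "- [Title](url): optional note"
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # section heading -> list of (title, url, note) tuples
    sections: dict = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the assumed llms.txt layout into a small structure."""
    doc = LlmsTxt()
    current = None  # heading of the section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("> ") and not doc.summary:
            doc.summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["title"], match["url"], match["note"] or "")
                )
    return doc


# A shortened version of the example above, used only for illustration.
SAMPLE = """\
# Eitan Gorodetsky

> Founder and operator writing about AI adoption.

## Key essays

- [What AI actually cites](/writing/what-ai-actually-cites): Original audit of 3,050 AI answers.

## About

- [About](/about): Background, focus areas, contact.
"""
```

Calling `parse_llms_txt(SAMPLE)` returns the title, the summary, and the two sections in the order they appear in the file. That order is the only priority signal the format carries.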

The difference from robots.txt

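robots.txt is a crawler instruction set. Its semantics are mostly negative: block this path, disallow this user-agent, stay out of this section. It was designed to keep crawlers out of parts of your site you don't want fetched. (Strictly, it governs crawling rather than indexing; a disallowed URL can still show up in an index if other pages link to it.)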

llms.txt runs the opposite direction. The semantics are positive: here are my most important pages, here is my primary claim, here is the order in which things matter. It's not a fence — it's a front door with a sign.
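The contrast is easy to see in code. Here is a small sketch using Python's standard-library `urllib.robotparser`: everything a robots.txt file can tell a crawler reduces to a yes or no per URL. The rules in `ROBOTS_TXT`, the `ExampleBot` user-agent and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler out of /drafts/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The only question robots.txt answers is "may I fetch this?".
blocked = allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/drafts/post")
open_page = allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/writing/essay")
```

Nothing in that interface says which of the permitted pages matters most. That gap is what llms.txt is trying to fill.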

A second structural difference: robots.txt is widely honoured because the major search engines adopted it as a convention decades ago and reputable crawlers have complied ever since; it was formalised as RFC 9309 in 2022. llms.txt has no such history yet. Which brings us to the honest part of this.

Does it actually work?

The honest answer in mid-2026 is: sometimes, for some systems, and the coverage is growing.

Perplexity has publicly stated they intend to use llms.txt when crawling. A handful of AI systems and agent frameworks have adopted it as a hint layer. Google's AI crawlers are evolving and the company has been quiet on whether they'll honour the format — AI Overviews uses its own signals, and those haven't been documented against this standard.

What we don't have is a controlled study showing measurable citation lift attributable specifically to adding llms.txt. Anyone telling you they measured that lift should share their methodology, because isolating one signal in the AI citation chain is genuinely hard. I haven't run that experiment. I've added llms.txt to the properties I manage and I treat it as a low-cost signal with asymmetric upside — not a guaranteed lever.

The case for adding it isn't "this definitely works." The case is: the cost is 15 minutes, the downside is nothing, and if adoption grows — which the trajectory of tooling suggests it will — you've pre-positioned your most citable content in a format those systems are explicitly looking for.

What to put in it

The most common mistake I see in llms.txt implementations is listing everything. That defeats the purpose. AI systems already crawl your content; what they lack is a clear signal about priority. Your llms.txt should answer one question: if an AI could only cite five pages from your site, which five would you most want it to cite?

The answer to that question is usually:

  • Your primary claim to authority (the page that most clearly states what you know and why)
  • Your highest-quality pillar content (the pages with the most original thinking or data)
  • Any FAQ or Q&A pages with structured factual content
  • Your about or expertise page (entity clarity for the model)
  • The specific topics you want your domain associated with in AI answers

Do not list thin pages, promotional pages, or pages that serve your conversion funnel but don't have independent informational value. An AI is not going to cite your pricing page. Listing it signals noise.

You can add a second file, llms-full.txt. By common convention it holds the full text of your key pages in a single markdown file rather than a longer list of links, for systems that want the complete picture. The base file should stay ruthlessly prioritised.
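One way to enforce the five-page discipline is to generate the file from a ranked list and cap it in code. This is a sketch under assumptions: the `Page` record, its `priority` field (1 is most important), the default `limit` of five and the "Key pages" heading are all illustrative choices, not part of the format.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    note: str
    priority: int  # 1 = the page you most want cited


def render_llms_txt(
    site: str,
    summary: str,
    pages: list[Page],
    limit: int = 5,
    heading: str = "Key pages",
) -> str:
    """Render an llms.txt body containing only the top `limit` pages."""
    top = sorted(pages, key=lambda p: p.priority)[:limit]
    lines = [f"# {site}", "", f"> {summary}", "", f"## {heading}", ""]
    lines += [f"- [{p.title}]({p.url}): {p.note}" for p in top]
    return "\n".join(lines) + "\n"
```

Anything ranked below the cutoff is simply left out, which makes adding a sixth page a deliberate decision rather than a default.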

Where llms.txt sits in the GEO stack

If you've been following the AI visibility conversation, you'll have noticed that the tactics list keeps expanding: structured data, entity signals, FAQPage schema, Wikidata presence, external citation building, forum participation, video content. llms.txt is one entry on that list. It's not a shortcut past the others.

The hierarchy, in rough order of impact on AI citation probability:

  1. Liftable, self-contained page content — the thing the AI actually quotes. No file at your domain root fixes a page that doesn't directly answer the question.
  2. FAQPage and structured schema — machine-readable signals that help AI systems parse your answer intent.
  3. External citation velocity — other credible sites linking to and citing you, which AI systems read as authority evidence.
  4. Entity clarity — consistent name, bio, and topic associations across Wikidata, Wikipedia where applicable, and your own About page.
  5. llms.txt — a priority signal and declaration of topical scope.

I've written about what AI actually cites at the page level — the audit data shows that ordinary web pages are 85–95% of what AI engines pull from, which means the content layer is where the game is won. llms.txt helps AI systems navigate your content once they've decided to crawl you. It doesn't make uncitable content citable.
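Of the layers above, FAQPage schema is the one with a fixed machine-readable shape, so it is worth seeing next to llms.txt. This is a minimal sketch that builds schema.org FAQPage JSON-LD from question and answer pairs. The function name `faq_jsonld` is illustrative, and the output still has to be embedded in the page inside a `<script type="application/ld+json">` tag.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)
```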

The implementation

If you're on a static site or CMS with a public/ or static/ directory, add an llms.txt file there. If you're on Next.js, drop it in /public/llms.txt and it's served automatically at the root. Verify it's accessible at yourdomain.com/llms.txt with curl or a browser check.
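The accessibility check can be scripted. This sketch uses only the standard library and a few heuristics that are assumptions of mine, not requirements of the format: an HTTP 200, a Content-Type that is not HTML (some setups answer unknown paths with an HTML fallback page and a 200), and a body that begins with an H1 line. The function names are illustrative.

```python
import urllib.error
import urllib.request


def check_llms_txt(status: int, content_type: str, body: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks fine."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    if "html" in content_type.lower():
        problems.append("served as HTML, possibly a fallback page")
    if not body.lstrip().startswith("# "):
        problems.append("body does not start with an H1 line")
    return problems


def fetch_and_check(domain: str) -> list[str]:
    """Fetch https://<domain>/llms.txt and run the checks above."""
    url = f"https://{domain}/llms.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            content_type = resp.headers.get("Content-Type", "")
            return check_llms_txt(resp.status, content_type, body)
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx instead of returning a response
        return [f"expected HTTP 200, got {err.code}"]
```

Running `fetch_and_check("yourdomain.com")` after a deploy should return an empty list. Anything else tells you which of the three checks failed.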

Update it when your content priorities shift. If you publish a new pillar piece that becomes your best-performing page, add it. If an old page is no longer representative of your current thinking, remove it. Treat it as a living document, not a set-and-forget artefact.

The robots.txt analogy fails in one more way worth naming: robots.txt is purely defensive; it doesn't improve how you appear in search. llms.txt is actively promotional: it's you making a case to AI systems about what your domain knows and why they should cite it. That's closer to a PR pitch than a crawler directive. And like a PR pitch, it works better when it's specific, credible, and short.

Fifteen minutes. No downside. Do it.

If you're thinking more systematically about AI visibility — what drives citation, how to measure it, and where the real leverage sits — the audit I ran across 3,050 AI answers is the place to start. The signal in that data is not what most LLM SEO advice assumes.

Written by

Eitan Gorodetsky

I run an AI-native marketing operation, and write about what it takes to operate this way.