Understanding LLMs.txt and the Impact on Website SEO
Meet llms.txt: the new “robots.txt” for AI
Ever heard of robots.txt? It’s the old-school file that tells search engines which pages to crawl. Well, llms.txt (yes, plural L‑L‑M‑S) is like robots.txt’s quirky cousin, aimed squarely at AI.
llms.txt is “a plain text file that tells AI systems which URLs on your site you consider to be high-quality, LLM-friendly content”, as Search Engine Land explains – i.e. the pages you want bots like ChatGPT, Claude or Gemini to ingest, understand and potentially cite. In other words, it’s a hand-crafted sitemap for language models.
The idea comes from data scientist Jeremy Howard (co-founder of Answer.AI), who in September 2024 proposed that websites add a /llms.txt file containing cleaned-up, plain-text content and links for AIs. Unlike a normal HTML page, this file is written in Markdown: a simple heading (`# Title`), a summary block (`> quote`) and lists of links (`[Link Title](URL): description`). Its job is not to lock bots out (like robots.txt does) but to hold their hands and say, “Here’s where the juicy stuff is”, effectively marking the treasure spots on your site for AI explorers.
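At its simplest, a valid llms.txt can be just a few lines. Here’s a minimal sketch (the site name and URLs are made up for illustration; a fuller example follows in the how-to section below):

```
# Example Site

> A one-line summary of what the site covers and who it's for.

## Key Pages
- [About Us](https://example.com/about): Who we are and what we do.
- [Pricing](https://example.com/pricing): Plans and costs at a glance.
```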
Why should website owners care?
Because generative AI is changing the search game. Google, Bing and other engines are baking LLMs into their results (think Google’s AI Overviews, ChatGPT’s web citations, etc.), and those models tend to pull from content that’s easy to ingest, easy to understand and easy to trust. A typical website (with nav menus, scripts and six layers of pages) can confuse an AI – it might never find your “golden nugget” of content if it lands on the wrong page. As one SEO expert put it, llms.txt lets you “plant flags” on those nuggets so the AI doesn’t wander your site lost at sea.
Think of llms.txt as an inference-time GPS. It doesn’t change your content, and it isn’t ingested into model training; it simply tells an AI, when it visits now, “start here”. That means more accurate, on-point answers for users – and more chance your content gets cited. For example, if a chatbot is answering a question about your industry (e.g., “Hey ChatGPT, find me a Bristol web design agency”), a well-crafted llms.txt can steer it to your blog or docs for data, boosting your visibility. It’s essentially GenAI-era SEO: some marketers call it Generative Engine Optimisation (GEO). In theory, guiding AIs in this way could improve your site’s visibility in AI-powered search results.
Of course, adoption is very early. Google’s John Mueller (Senior Webmaster Trends Analyst) recently said (candidly) that nobody at Google or OpenAI is using llms.txt yet – “you can tell by server logs” – and likened it to the old, ignored keywords meta tag. So it’s not a sure-fire ranking hack… yet. But given how fast AI features are rolling out, many site owners figure it’s worth an experiment. After all, this file is optional and harmless. If nothing else, it favours early adopters: as one analyst put it, in the absence of any other AI optimisation, “what have we got to lose?” by “marking the spot with a giant X” on our best content.
How to create and implement /llms.txt
Putting together an llms.txt file is (mostly) straightforward – it’s just a text file you upload to your site’s root (e.g. https://example.com/llms.txt), much like robots.txt.
Here’s the basic recipe:
- File format: plain text written in Markdown. (Not HTML, not XML.)
- Location: Save it at the domain root as /llms.txt. (Some guides note that you can place it at a sub-path if your tooling requires it, but the root is the norm.)
- Filename: Exactly llms.txt (the plural) or it won’t be recognised.
- Content structure: Start with a single H1 heading (#) naming your site or project (this is required). Then you may include a block-quote (>) with a short summary of what’s in the file. After that, you can add any plain paragraphs or bullet-point lists for context (though these are optional). Next come one or more H2 sections (##) with link lists. Under each H2, list the key pages or docs as markdown links in bullet form. For example:
```
# MySite.com Docs

> Essential developer guides and reference materials

We cover everything from getting started to advanced integration.

## Core Guides
- [Quickstart Guide](https://mysite.com/docs/quickstart): Step-by-step setup.
- [API Reference](https://mysite.com/docs/api): Detailed API docs.

## Optional Resources
- [Release Notes](https://mysite.com/changelog): Version history (optional reading).
```
(Note: an “## Optional” section can be included for secondary links; AI tools may skip it if they need a shorter context.)
Once you’ve created llms.txt, just upload it to your web host’s root directory. On WordPress sites, you can often drop it in via FTP to public_html/, or use a plugin. For example, JetBrains’ Writerside docs tool can auto-generate llms.txt from your project, and there are WP plugins (e.g. LLMs.txt & Full TXT Generator) that crawl your posts and produce the file automatically. In fact, that plugin has been downloaded over 3,000 times in its first few months, showing plenty of interest. Our go-to, Yoast, now also offers this functionality, although it is currently a premium feature.
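If you want to confirm the file is live and roughly follows the structure described above, a few lines of Python will do it. This is a minimal sketch, not an official validator – the domain is a placeholder and the checks simply mirror the recipe in this post (a single H1 title, H2 sections, markdown link lists):

```python
# Quick sanity check for a live llms.txt file (illustrative sketch only).
import re
import urllib.request

SITE = "https://example.com"  # placeholder - swap in your own domain


def check_llms_txt(site: str) -> None:
    url = site.rstrip("/") + "/llms.txt"
    with urllib.request.urlopen(url, timeout=10) as resp:
        text = resp.read().decode("utf-8", errors="replace")

    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        print("File is empty - nothing for an AI to read.")
        return

    # The proposal expects the file to open with a single H1 title.
    h1_count = sum(1 for ln in lines if ln.startswith("# "))
    print(f"Starts with H1: {lines[0].startswith('# ')} (H1 count: {h1_count})")

    # H2 sections group the link lists; 'Optional' marks skippable extras.
    sections = [ln[3:] for ln in lines if ln.startswith("## ")]
    print(f"Sections: {sections or 'none found'}")

    # Count markdown links of the form [Title](URL): description
    links = re.findall(r"\[[^\]]+\]\((https?://[^\s)]+)\)", text)
    print(f"Links listed: {len(links)}")


if __name__ == "__main__":
    check_llms_txt(SITE)
```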
SEO, licensing and AI-training considerations
Implementing llms.txt raises some big questions beyond just mechanics. Let’s unpack a few key points:
- Generative SEO: By guiding AIs to your best content, you’re effectively optimising for AI. In theory, this could mean higher visibility in AI-driven search features. If a chatbot is looking for an answer and your llms.txt points it at your FAQ, it might answer with your content. This is similar to SEO’s goal of getting crawlers to the right pages, but for AI inference. DCOED calls it generative optimisation – not a ranking guarantee, but potentially an edge if AI overviews become mainstream.
- Brand and citation control: You get a bit more say in how an AI sees your brand. If the AI scrapes your curated file, it’s more likely to cite you rather than misinterpreting or finding low-quality copies elsewhere. As one expert puts it, llms.txt can help with “brand reputation management” by giving businesses some control over how their info appears in AI-generated responses. (Think: fewer hallucinations about your products, because the AI has the correct data front and centre.)
- Not a magic licence: Crucially, llms.txt does not grant or revoke any rights over your content. It’s purely guidance at inference time. It won’t stop an AI from training on your site if it’s crawled – that’s governed by copyright law and other opt-out signals like robots.txt (see the sketch after this list). In fact, even if you blocked training via other means, a public page is still accessible to an AI during a live chat the moment someone asks about it (searchengineland.com). In practice, Google’s own SEO guru has downplayed llms.txt’s power, saying none of the big AI services even use it yet and likening it to the long-forgotten “keywords” meta tag. So if you have strict licensing needs, don’t rely on llms.txt alone; use proper copyright notices or robots policies instead.
- Competitive risk: Putting all your best content into a single file has a flip side: everyone can see it easily. Competitors or scrapers could use llms.txt to harvest your content or keywords with minimal effort. As one cautionary note observes, it “lowers the bar for your competitors to easily analyze what you have”. In other words, you’ve handed them a neatly organised copy of your site’s outline. So only list content you’re comfortable sharing publicly.
- Future outlook: Right now, adoption is spotty (aside from tech docs sites) but AI search is evolving fast. If llms.txt becomes a standard, early adopters might benefit from better AI citations and richer results. If not, the downside is just “effort sunk” and maybe some extra hits (or leeches). One thing’s for sure: generative search is coming and changing how visibility works. Keeping your site “AI-ready” with an llms.txt is like keeping a clean sitemap for Google – it’s forward-thinking housekeeping. As one commentator put it, in today’s AI-driven web “what have we got to lose?” by optimising ourselves for these new bots.
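On the licensing point above: if your goal is to opt out of AI training rather than to guide inference, that’s a job for robots.txt, not llms.txt. As a rough illustration, the major training crawlers that respect robots.txt (OpenAI’s GPTBot, and Google-Extended for Gemini training) can be blocked like this – do check each vendor’s current documentation, as crawler names change:

```
# robots.txt - training opt-out example (verify current user-agent names with each vendor)
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```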
In summary, llms.txt is a small change that could have a big future payoff. It’s your chance to whisper the right URLs into the ears of AI assistants. Craft it with your best content, test how (if at all) AIs use it, and revisit it as the tech matures. For now, it’s not mandatory, but if you run a content-heavy site (docs, tutorials, product info), it’s worth adding the file – you’ll sleep better knowing the AI has a treasure map to your best stuff, and who knows where it might lead in 2025 and beyond.
Need help setting it up on your site? We can help. Or we can just chat AI ethics over a pint.
Sources & Further Reading: