# Make the site discoverable to LLM crawlers
When LLM crawlers and agents need to ingest the site without scraping HTML, the `/llms.txt` index plus per-page stripped-markdown sidecars give them a direct surface. On `AddDocSite` hosts the wiring is automatic and output is controlled through front matter; on bare `AddPennington` hosts a single `AddLlmsTxt(...)` call enables it. If no site exists yet, start with Your first Pennington site.
## Assumptions
- A working Pennington site (see Your first Pennington site if not)
- The chosen host extension — `AddDocSite` vs bare `AddPennington` — and the reason for that choice (see When is DocSite the right starting point?)

For a working DocSite setup with one opted-out page, refer to `Content/main/llms-hidden.md` in `examples/DocSiteKitchenSinkExample`.
## Options
### Decide: DocSite front matter, or bare `AddLlmsTxt`?
`AddDocSite` already calls `AddLlmsTxt` internally and defaults `ContentSelector` to `#main-content`. On a DocSite host, per-page inclusion is controlled through front matter (below), with an optional selector override through `DocSiteOptions.LlmsTxtContentSelector`. On a bare `AddPennington` host nothing is wired — `AddLlmsTxt(...)` needs an explicit call. Markdown content always renders through the engine's rendition channel; the selector applies only to the HTTP-fetch fallback used for Razor pages and API symbol pages, where it scopes the live HTML to the article and strips layout chrome.
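The two wiring paths can be sketched side by side. This is a sketch, not a verbatim setup: `penn` as the host builder variable and the exact registration shape around it are assumptions, while `AddDocSite`, `AddLlmsTxt`, and `ContentSelector` are the names this page documents.

```csharp
// DocSite host: AddLlmsTxt is wired internally and ContentSelector
// already defaults to "#main-content" -- nothing more to call.
penn.AddDocSite();

// Bare Pennington host: nothing is wired until AddLlmsTxt is called.
penn.AddLlmsTxt(opts =>
{
    // Scope the HTTP-fetch fallback to the article wrapper so layout
    // chrome doesn't bleed into the LLM sidecars.
    opts.ContentSelector = "article";
});
```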
### (DocSite) Opt a page out with `llms: false`
Every non-draft page is included in the index by default (`Llms = true`). Setting `llms: false` in a page's front matter causes `LlmsTxtService` to skip it when assembling `/llms.txt` and its sidecar markdown. The page still renders, appears in the sidebar, and participates in search unless `search: false` is also set.
```markdown
---
title: Not in llms.txt
description: This page is intentionally excluded from llms.txt.
sectionLabel: authoring
order: 230
llms: false
uid: kitchen-sink.main.llms-hidden
---

# Not in llms.txt

This page carries `llms: false` in its front matter. It still appears
in the sidebar and in search results, but `/llms.txt` does **not**
list it. The content-stripping llms generator skips pages whose
`Llms` flag is `false` when assembling its index of documents.
```
The flag it reads:

```csharp
/// <summary>When false, the page is excluded from the generated llms.txt output.</summary>
public bool Llms { get; init; } = true;
```
For a custom `ContentSelector` (different article wrapper or a non-DocSite layout), set `DocSiteOptions.LlmsTxtContentSelector`. It defaults to `#main-content` and is overridable without leaving DocSite. See When is DocSite the right starting point? for cases that do require bare `AddPennington`.
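That override might look like the following sketch; the option name is from this page, but the configure-lambda shape on `AddDocSite` and the `.docs-body` selector are assumptions for illustration.

```csharp
penn.AddDocSite(docSite =>
{
    // Hypothetical custom layout whose article wrapper is <main class="docs-body">.
    // Only the HTTP-fetch fallback (Razor and API symbol pages) uses this
    // selector; markdown pages render through the rendition channel regardless.
    docSite.LlmsTxtContentSelector = ".docs-body";
});
```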
### Split content per-fragment with `humans-only` / `robots-only`
For finer control than page-level opt-out, two paired classes mark a fragment as intended for one audience or the other. Both ship as part of the MonorailCSS base styles, so no registration is needed.
- `humans-only` — visible in the browser, stripped from the llms.txt extraction. Reach for it when a widget, interactive demo, or layout flourish carries no information an LLM needs.
- `robots-only` — hidden in the browser via `display: none`, kept in the llms.txt extraction. Reach for it when an LLM needs context the human reader already has visually (a full signature next to a compact hover card, prose that mirrors a diagram, etc.).
```html
<div class="humans-only">
  <InteractiveTour />
</div>

<div class="robots-only">
  <p>Full method signature:
     <code>Task&lt;Result&gt; ProcessAsync(Options options, CancellationToken ct = default)</code>.</p>
</div>
```
The classes work anywhere in the rendered page — markdown bodies, Razor components, auto-generated reference pages.
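In a markdown body the same split might look like this; it is a sketch, and the `<VideoWalkthrough />` component and the surrounding prose are invented for illustration.

```markdown
<div class="humans-only">
  <VideoWalkthrough slug="first-site" />
</div>

<div class="robots-only">
The walkthrough above covers: creating the project, adding one markdown
page, and serving the site locally with `dotnet run`.
</div>
```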
### (Bare Pennington) Enable `LlmsTxtOptions` with `AddLlmsTxt`
On a bare host nothing is wired until `penn.AddLlmsTxt(...)` is called. The options surface covers `OutputDirectory` (where per-page sidecars land, defaults to `_llms`), `GenerateFullFile` (emit a concatenated `/llms-full.txt` for one-shot ingest), and `ContentSelector` (CSS selector applied to the HTTP-fetched HTML for non-markdown content; `null` means the whole `<body>`).
```csharp
penn.AddLlmsTxt(opts =>
{
    opts.OutputDirectory = "_llms";
    opts.ContentSelector = "article";
    opts.GenerateFullFile = false;
});
```
```csharp
/// <summary>Configuration for llms.txt generation.</summary>
public sealed class LlmsTxtOptions
{
    /// <summary>Output directory for raw markdown files (relative to site root). Default: "_llms".</summary>
    public string OutputDirectory { get; set; } = "_llms";

    /// <summary>Whether to also generate llms-full.txt with all content concatenated.</summary>
    public bool GenerateFullFile { get; set; }

    /// <summary>
    /// CSS selector used to scope the HTML-to-markdown conversion when a page is fetched
    /// over HTTP for the LLM channel. Markdown-source pages render via the rendition
    /// channel and ignore this setting — this only applies to Razor pages and other
    /// non-markdown content where the LlmsTxtService falls back to fetching the live
    /// rendered HTML and stripping the layout chrome.
    /// <para>
    /// Default <see langword="null"/> means the whole <c>&lt;body&gt;</c> is used.
    /// Hosts with a layout shell (e.g. DocSite's <c>#main-content</c>) should set this
    /// so navigation, footers, and other chrome don't bleed into the LLM sidecars.
    /// </para>
    /// </summary>
    public string? ContentSelector { get; set; }
}
```
### (Optional) Turn on `GenerateFullFile` for a concatenated snapshot
`GenerateFullFile = true` emits `/llms-full.txt`, the same per-page markdown concatenated into one file — useful for one-shot ingest by agents that cannot follow per-page links. The default is `false` because the full file can be large; enable it when a known consumer needs it.
```csharp
/// <summary>Whether to also generate llms-full.txt with all content concatenated.</summary>
public bool GenerateFullFile { get; set; }
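Enabling it is one line on the options; as elsewhere on this page, the `penn.AddLlmsTxt` configure-lambda shape is an assumption.

```csharp
penn.AddLlmsTxt(opts =>
{
    // Opt in explicitly; the default stays false because the
    // concatenated /llms-full.txt can be large.
    opts.GenerateFullFile = true;
});
```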
## Result
`/llms.txt` lists each indexed page as a markdown link grouped by section, and each page gets a stripped-markdown sidecar at `/_llms/<page>.md`. Links are fully qualified when `PenningtonOptions.CanonicalBaseUrl` is set (or `build --base-url https://…` is passed); otherwise they fall back to root-relative `/_llms/...` so an agent that fetched `/llms.txt` can still resolve them against the origin.
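Setting the base URL in code might look like this sketch. `CanonicalBaseUrl` on `PenningtonOptions` is the documented option; the `Configure` hook used to reach it is an assumption, and the `build --base-url` flag remains the CLI alternative.

```csharp
penn.Configure(opts =>
{
    // Hypothetical configuration hook; with an absolute origin set,
    // /llms.txt links come out fully qualified instead of root-relative.
    opts.CanonicalBaseUrl = "https://docs.example.com";
});
```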
A typical excerpt:
```markdown
# Pennington Docs

> Content engine library for .NET.

## Tutorials

- [Your first Pennington site](https://docs.example.com/tutorials/getting-started/first-site): Build a static site from a single markdown file.
- [Add a second locale](https://docs.example.com/tutorials/beyond-basics/add-a-locale): Ship the same content in a second language.

## How-to

- [Switch the body and heading typeface](https://docs.example.com/how-to/configuration/fonts): Self-host woff2, declare preloads, point the family options at the new faces.
```
## Verify
- Run `dotnet run` and fetch `/llms.txt`. Expect a metadata block, a `## Map` section listing per-subtree splits with token estimates, then nav-grouped links — and no line for any page marked `llms: false`
- Fetch one per-page sidecar (for example, `/_llms/<page>.md`). Expect a YAML header with `canonical_url`, `content_hash`, `tokens`, and the body stripped to clean markdown
- With `GenerateFullFile = true`, fetch `/llms-full.txt`. Expect every sidecar concatenated in one response
## Related

- Reference: `LlmsTxtOptions`
- How-to: Tune what the search box returns
- Background: When is DocSite the right starting point?