

Make the site discoverable to LLM crawlers

When LLM crawlers and agents need to ingest the site without scraping HTML, the /llms.txt index plus per-page stripped-markdown sidecars give them a direct surface. On AddDocSite hosts the wiring is automatic and output is controlled through front matter; on bare AddPennington hosts a single AddLlmsTxt(...) call enables it. If no site exists yet, start with Your first Pennington site.

Assumptions

For a working DocSite setup with one opted-out page, refer to Content/main/llms-hidden.md in examples/DocSiteKitchenSinkExample.


Options

Decide: DocSite front matter, or bare AddLlmsTxt?

AddDocSite already calls AddLlmsTxt internally and defaults ContentSelector to #main-content. On a DocSite host, per-page inclusion is controlled through front matter (below), with an optional selector override through DocSiteOptions.LlmsTxtContentSelector. On a bare AddPennington host nothing is wired — AddLlmsTxt(...) needs an explicit call. Markdown content always renders through the engine's rendition channel; the selector applies only to the HTTP-fetch fallback used for Razor pages and API symbol pages, where it scopes the live HTML to the article and strips layout chrome.
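The two paths can be sketched side by side. This is illustrative wiring, not a verbatim setup: the `penn` builder variable and the delegate-configuration overloads are assumptions; only AddDocSite, AddLlmsTxt, DocSiteOptions.LlmsTxtContentSelector, and the #main-content default come from this page.

```
// Path 1 — DocSite host: AddLlmsTxt is wired automatically and
// ContentSelector defaults to "#main-content"; nothing more to call.
penn.AddDocSite();

// Path 2 — bare AddPennington host: nothing is wired until
// AddLlmsTxt is called explicitly (delegate shape assumed).
penn.AddLlmsTxt(opts =>
{
    opts.ContentSelector = "article";
});
```

On a DocSite host, calling AddLlmsTxt again is unnecessary; per-page behavior is driven by front matter, as shown in the next section.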

(DocSite) Opt a page out with llms: false

Every non-draft page is included in the index by default (Llms = true). Setting llms: false in a page's front matter causes LlmsTxtService to skip it when assembling /llms.txt and its sidecar markdown. The page still renders, appears in the sidebar, and participates in search unless search: false is also set.

---
title: Not in llms.txt
description: This page is intentionally excluded from llms.txt.
sectionLabel: authoring
order: 230
llms: false
uid: kitchen-sink.main.llms-hidden
---
  
# Not in llms.txt
  
This page carries `llms: false` in its front matter. It still appears
in the sidebar and in search results, but `/llms.txt` does **not**
list it. The content-stripping llms generator skips pages whose
`Llms` flag is `false` when assembling its index of documents.
/// <summary>When false, the page is excluded from the generated llms.txt output.</summary>
public bool Llms { get; init; } = true;

For a custom ContentSelector (different article wrapper or a non-DocSite layout), set DocSiteOptions.LlmsTxtContentSelector. It defaults to #main-content and is overridable without leaving DocSite. See When is DocSite the right starting point? for cases that do require bare AddPennington.
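A sketch of that override, assuming a delegate-configuration overload of AddDocSite; the `#doc-body` wrapper is a hypothetical custom layout, and only DocSiteOptions.LlmsTxtContentSelector itself is confirmed by this page:

```
penn.AddDocSite(docs =>
{
    // Layout wraps the article in <main id="doc-body">
    // instead of the default #main-content.
    docs.LlmsTxtContentSelector = "#doc-body";
});
```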

Split content per-fragment with humans-only / robots-only

For finer control than page-level opt-out, two paired classes mark a fragment as intended for one audience or the other. Both ship as part of the MonorailCSS base styles, so no registration is needed.

  • humans-only — visible in the browser, stripped from the llms.txt extraction. Reach for it when a widget, interactive demo, or layout flourish carries no information an LLM needs.
  • robots-only — hidden in the browser via display: none, kept in the llms.txt extraction. Reach for it when an LLM needs context the human reader already has visually (a full signature next to a compact hover card, prose that mirrors a diagram, etc.).
<div class="humans-only">
  <InteractiveTour />
</div>
  
<div class="robots-only">
  <p>Full method signature: <code>Task&lt;Result&gt; ProcessAsync(Options options, CancellationToken ct = default)</code>.</p>
</div>

The classes work anywhere in the rendered page — markdown bodies, Razor components, auto-generated reference pages.
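In a markdown body the same classes are used as raw HTML islands — a sketch, with the surrounding prose invented for illustration:

```
The diagram below shows the request flow.

<div class="robots-only">
  <p>Request flow: client → edge cache → origin → rendition channel.</p>
</div>
```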

(Bare Pennington) Enable LlmsTxtOptions with AddLlmsTxt

On a bare host nothing is wired until penn.AddLlmsTxt(...) is called. The options surface covers OutputDirectory (where per-page sidecars land, defaults to _llms), GenerateFullFile (emit a concatenated /llms-full.txt for one-shot ingest), and ContentSelector (CSS selector applied to the HTTP-fetched HTML for non-markdown content; null means the whole <body>).

penn.AddLlmsTxt(opts =>   // delegate-configuration overload assumed
{
    opts.OutputDirectory = "_llms";
    opts.ContentSelector = "article";
    opts.GenerateFullFile = false;
});
/// <summary>Configuration for llms.txt generation.</summary>
public sealed class LlmsTxtOptions
{
    /// <summary>Output directory for raw markdown files (relative to site root). Default: "_llms".</summary>
    public string OutputDirectory { get; set; } = "_llms";
  
    /// <summary>Whether to also generate llms-full.txt with all content concatenated.</summary>
    public bool GenerateFullFile { get; set; }
  
    /// <summary>
    /// CSS selector used to scope the HTML-to-markdown conversion when a page is fetched
    /// over HTTP for the LLM channel. Markdown-source pages render via the rendition
    /// channel and ignore this setting — this only applies to Razor pages and other
    /// non-markdown content where the LlmsTxtService falls back to fetching the live
    /// rendered HTML and stripping the layout chrome.
    /// <para>
    /// Default <see langword="null"/> means the whole <c>&lt;body&gt;</c> is used.
    /// Hosts with a layout shell (e.g. DocSite's <c>#main-content</c>) should set this
    /// so navigation, footers, and other chrome don't bleed into the LLM sidecars.
    /// </para>
    /// </summary>
    public string? ContentSelector { get; set; }
}

(Optional) Turn on GenerateFullFile for a concatenated snapshot

GenerateFullFile = true emits /llms-full.txt, the same per-page markdown concatenated into one file — useful for one-shot ingest by agents that cannot follow per-page links. The default is false because the full file can be large; enable it when a known consumer needs it.

/// <summary>Whether to also generate llms-full.txt with all content concatenated.</summary>
public bool GenerateFullFile { get; set; }
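Enabling it is a one-line option — a sketch assuming a delegate-configuration overload of AddLlmsTxt:

```
penn.AddLlmsTxt(opts =>
{
    // A known consumer ingests the whole site in one request.
    opts.GenerateFullFile = true;
});
```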

Result

/llms.txt lists each indexed page as a markdown link grouped by section, and each page gets a stripped-markdown sidecar at /_llms/<page>.md. Links are fully qualified when PenningtonOptions.CanonicalBaseUrl is set (or build --base-url https://… is passed); otherwise they fall back to root-relative /_llms/... so an agent that fetched /llms.txt can still resolve them against the origin.
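To get fully qualified links, set the base URL in options or on the build command. PenningtonOptions.CanonicalBaseUrl and the --base-url flag are named on this page; where and how the options object is configured is an assumption, and the domain mirrors the excerpt below:

```
// Wherever PenningtonOptions is configured (wiring shape assumed):
options.CanonicalBaseUrl = "https://docs.example.com";
```

The CLI equivalent passes the same value at build time: build --base-url https://docs.example.com.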

A typical excerpt:

# Pennington Docs
  
> Content engine library for .NET.
  
## Tutorials
  
- [Your first Pennington site](https://docs.example.com/tutorials/getting-started/first-site): Build a static site from a single markdown file.
- [Add a second locale](https://docs.example.com/tutorials/beyond-basics/add-a-locale): Ship the same content in a second language.
  
## How-to
  
- [Switch the body and heading typeface](https://docs.example.com/how-to/configuration/fonts): Self-host woff2, declare preloads, point the family options at the new faces.

Verify

  • Run dotnet run and fetch /llms.txt. Expect a metadata block, a ## Map section listing per-subtree splits with token estimates, then nav-grouped links — and no line for any page marked llms: false
  • Fetch one per-page sidecar (for example, /_llms/<page>.md). Expect a YAML header with canonical_url, content_hash, tokens, and the body stripped to clean markdown
  • With GenerateFullFile = true, fetch /llms-full.txt. Expect every sidecar concatenated in one response