Publish a sitemap
Crawlers use /sitemap.xml as the canonical list of URLs to ingest. Any AddPennington-based host registers and serves it automatically, so a working site already emits one. The options below give crawlers absolute <loc> values, let you confirm that drafts and redirects are excluded, and turn off sitemap generation on a BlogSite host. For a first site, start with Create your first Pennington site.
Assumptions
- A working Pennington site (see Create your first Pennington site if not)
- Pages using an IFrontMatter implementation (DocFrontMatter, BlogFrontMatter, or a custom one) so IsDraft and (optionally) Date flow through to the sitemap builder
- A known publishing target: either a fully-qualified URL (set CanonicalBaseUrl) or a sub-path via dotnet run -- build /sub/ (the sitemap falls back to OutputOptions.BaseUrl)
Options
Confirm /sitemap.xml is already wired
AddPennington registers SitemapService and UsePennington maps GET /sitemap.xml to it. There is no AddSitemap(...) call to make and no toggle on PenningtonOptions. The service walks every registered IContentService.DiscoverAsync result, skipping non-HTML outputs and RedirectSource placeholders before the builder applies its own filters.
Set CanonicalBaseUrl so <loc> values resolve
When CanonicalBaseUrl is set on PenningtonOptions, DocSiteOptions, or BlogSiteOptions, the sitemap builder prefixes every URL with it (typically https://your-domain.com/), producing the absolute <loc> entries crawlers require. When it is not set and the static build targets a sub-path (dotnet run -- build /sub/), the builder falls back to OutputOptions.BaseUrl, producing root-relative entries like /sub/page/. The sitemap protocol expects each <loc> to be a fully-qualified URL that begins with the protocol, so set CanonicalBaseUrl for any site that crawlers should index.
new()
{
    SiteTitle = "Pennington Kitchen Sink Blog",
    Description = "A kitchen-sink BlogSite example that backs two how-to pages and two reference pages.",
    CanonicalBaseUrl = "https://blog.example.com",
    AuthorName = "Jamie Rivers",
    AuthorBio = "Writing about content engines, static sites, and the tools in between.",
    // Explicit for teaching value even though both default to true.
    EnableRss = true,
    EnableSitemap = true,
    HeroContent = BuildHero(),
    MyWork = BuildMyWork(),
    Socials = BuildSocials(),
    MainSiteLinks = BuildMainSiteLinks(),
}
See Pennington.BlogSite.BlogSiteOptions for the backing CanonicalBaseUrl property.
Use IsDraft and redirectUrl: to exclude pages
SitemapBuilder.Build drops any candidate whose front matter has isDraft: true and drops any candidate whose front matter implements IRedirectable with a non-empty RedirectUrl; redirect stubs are never listed as canonical URLs. search: false and llms: false are not honored here. Those are search-UX preferences, not SEO directives, so opting a page out of client-side search does not remove it from the sitemap.
var builder = ImmutableList.CreateBuilder<SitemapEntry>();
foreach (var candidate in candidates)
{
    if (candidate.Metadata is { IsDraft: true })
        continue;

    // Redirects have no sitemap meaning; they aren't canonical URLs.
    if (candidate.Metadata is IRedirectable { RedirectUrl: { Length: > 0 } })
        continue;

    var absoluteUrl = candidate.Route.AbsoluteUrl(_canonicalBase);
    var lastModified = candidate.Metadata?.Date;
    builder.Add(new SitemapEntry(
        Url: absoluteUrl,
        LastModified: lastModified,
        ChangeFrequency: null,
        Priority: null
    ));
}
return builder.ToImmutable();
The two front-matter members that drive the filter are IFrontMatter.IsDraft and IRedirectable.RedirectUrl; see Pennington.FrontMatter.IFrontMatter.
(BlogSite only) Set EnableSitemap = false to turn it off
On an AddBlogSite host, BlogSiteOptions.EnableSitemap (default true) is the one knob that unregisters the /sitemap.xml endpoint. Set it to false when the host environment owns its own sitemap. On a bare AddPennington or AddDocSite host the endpoint is always mapped; there is no equivalent toggle because the sitemap has no per-request cost when nothing fetches it.
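As an illustration, the toggle could look like the fragment below. This is a sketch under assumptions, not a verified Pennington snippet: it uses only members that appear in this guide (SiteTitle, CanonicalBaseUrl, EnableSitemap), assumes BlogSiteOptions can be constructed with just those set, and leaves out how the options object is handed to AddBlogSite. The property values are placeholders.

```csharp
using Pennington.BlogSite;

// Placeholder values; only members mentioned in this guide are set.
var options = new BlogSiteOptions
{
    SiteTitle = "Example Blog",
    CanonicalBaseUrl = "https://blog.example.com",
    // Defaults to true; false removes the /sitemap.xml endpoint on an AddBlogSite host.
    EnableSitemap = false,
};
```

With this in place, a request to /sitemap.xml should no longer be answered by Pennington, which leaves the path free for whatever the host environment serves there.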
Result
/sitemap.xml returns a <urlset> with one <url> per non-draft, non-redirect page, with absolute <loc> values when CanonicalBaseUrl is set:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/how-to/configuration/sitemap/</loc>
    <lastmod>2024-02-03</lastmod>
  </url>
</urlset>
Verify
- Run dotnet run and fetch /sitemap.xml. Expect a <urlset> document with one <url><loc>…</loc></url> per non-draft, non-redirect page
- Mark a page isDraft: true or set redirectUrl: on it and refetch. That URL is absent from the <urlset>
- Publish with CanonicalBaseUrl = "https://example.com" and confirm every <loc> starts with https://example.com/. Omit it and run dotnet run -- build /sub/ to see <loc> values start with /sub/
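The prefix check can also be scripted. The program below is a minimal sketch and not part of Pennington: it parses a sitemap document with System.Xml.Linq and reports any <loc> that does not start with an expected prefix. The inline sample XML and the expectedPrefix value are placeholders. In practice the XML would be the response body fetched from /sitemap.xml on your host, and the prefix would be your CanonicalBaseUrl (or /sub/ for a sub-path build).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Placeholder input: replace with the body fetched from /sitemap.xml.
const string sitemapXml = @"<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>
  <url><loc>https://example.com/</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://example.com/how-to/configuration/sitemap/</loc></url>
</urlset>";

// Placeholder prefix: the value every <loc> is expected to start with.
const string expectedPrefix = "https://example.com/";

// Sitemap elements live in this XML namespace, so lookups must be qualified with it.
XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

var locs = XDocument.Parse(sitemapXml)
    .Descendants(ns + "loc")
    .Select(e => e.Value.Trim())
    .ToList();

// Collect every entry that does not begin with the expected prefix.
var offenders = locs
    .Where(loc => !loc.StartsWith(expectedPrefix, StringComparison.Ordinal))
    .ToList();

Console.WriteLine($"{locs.Count} <loc> entries, {offenders.Count} with an unexpected prefix");
foreach (var loc in offenders)
    Console.WriteLine($"  unexpected: {loc}");
```

For the sample input this prints "2 <loc> entries, 0 with an unexpected prefix". The same list of <loc> values can be searched to confirm that a draft or redirect URL is absent.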
Related
- Reference: SitemapService
- How-to: Publish an RSS feed
- How-to: Configure redirects