pyssg is pre-1.0 and under active development - APIs, config, and themes may change.
pyssg.contrib.llms

Contrib plugin: llms.txt / llms-full.txt output.

Emits an AI-consumable index of the site following the llms.txt convention (https://llmstxt.org/):

  • /llms.txt -- a Markdown index: an # H1 site title, a > blockquote summary, then one ## Section per top-level URL segment listing each page as - [title](absolute-url): excerpt.
  • /llms-full.txt -- the selected pages' Markdown bodies concatenated into a single document (separated by ---), so an agent can ingest the whole site in one fetch.
  • markdown_pages=True (opt-in) additionally emits a raw .md next to every page (/reference/cli/ -> /reference/cli.md), so an agent can fetch the Markdown of a single page directly.

Relative .md links inside the bodies are resolved to absolute URLs (the bodies are pre-resolution Markdown, so a raw foo.md link would otherwise 404 on a site that serves clean URLs). When markdown_pages is on, every link (in the index, the full file and the per-page files) points to the target page's .md file, so an agent can crawl the whole site as Markdown; otherwise links point to the clean HTML page URLs.
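The link rewriting described above can be sketched as follows. This is a minimal illustration, not the plugin's code: the function name resolve_md_links, the source_path and base_url parameters, and the regex-based link matching are all assumptions, and index pages and reference-style links are not handled.

```python
import re
from urllib.parse import urljoin

# Inline Markdown links only: [text](target). Reference-style links are ignored.
_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def resolve_md_links(body: str, *, source_path: str, base_url: str,
                     markdown_pages: bool = False) -> str:
    """Rewrite relative .md links in `body` to absolute URLs.

    source_path is the linking document's site-relative path, e.g.
    "reference/intro.md" (a hypothetical convention for this sketch).
    """
    doc_url = base_url.rstrip("/") + "/" + source_path

    def _sub(match: re.Match) -> str:
        text, target = match.group(1), match.group(2)
        path, sep, fragment = target.partition("#")
        # Leave absolute URLs, site-rooted paths and non-.md targets alone.
        if "://" in path or path.startswith("/") or not path.endswith(".md"):
            return match.group(0)
        resolved = urljoin(doc_url, path)
        if not markdown_pages:
            # Clean-URL form: .../cli.md becomes .../cli/
            resolved = resolved[: -len(".md")] + "/"
        return f"[{text}]({resolved}{sep}{fragment})"

    return _LINK.sub(_sub, body)
```

With markdown_pages=False a link to cli.md written in reference/intro.md becomes the clean page URL; with markdown_pages=True it keeps the .md suffix, matching the /reference/cli/ -> /reference/cli.md mapping described earlier.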

Honest positioning: the value is for IDE agents (Cursor/Cline/Aider) and MCP doc servers that ingest a site as context -- not "SEO for AI". Prior art: mkdocs-llmstxt.

Like the sitemap/rss plugins, this is a summarizer fan-in: it taps evaluate_collections (after nav/taxonomy, so every virtual page already exists), scans the final graph for document-backed pages, and materializes one or two virtual pages carrying the rendered text as content_html with template=None (the render contract for "emit verbatim, no layout"). It reads only declared inputs (page urls, document meta and the Markdown body kept on the node as __body__) and never the clock or the filesystem, and it sorts deterministically, so two builds are byte-identical and an incremental rebuild matches a full one.

Selection: pages are grouped/filtered by section (the first URL segment). include keeps only the listed sections (None = all), exclude drops sections, and a document can opt out entirely with llms: false in its frontmatter. This plugin is stdlib only and pure; per the contrib rules it ships tests and is not auto re-exported into pyssg.plugins.
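The selection rules above can be sketched as a small predicate. The helper names section_of and is_selected, and the plain-dict representation of frontmatter, are assumptions made for illustration; the plugin's internal helpers may differ.

```python
from typing import Mapping, Optional, Tuple


def section_of(url: str) -> str:
    """Return the first URL segment: '/reference/cli/' -> 'reference', '/' -> ''."""
    return url.strip("/").split("/")[0]


def is_selected(url: str, meta: Mapping[str, object], *,
                include: Optional[Tuple[str, ...]] = None,
                exclude: Tuple[str, ...] = ()) -> bool:
    """Apply the opt-out flag first, then the include and exclude filters."""
    if meta.get("llms") is False:  # frontmatter `llms: false`
        return False
    section = section_of(url)
    if include is not None and section not in include:
        return False
    return section not in exclude


# Hypothetical pages: (url, frontmatter) pairs.
pages = [
    ("/reference/cli/", {}),
    ("/blog/hello/", {}),
    ("/guides/start/", {"llms": False}),
    ("/guides/deploy/", {}),
]
selected = sorted(url for url, meta in pages if is_selected(url, meta, exclude=("blog",)))
```

Sorting the selected URLs keeps the output order independent of scan order, in line with the determinism requirement stated earlier.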

render_index(*, title: str, summary: str, entries: list[_Entry]) -> str

Render the /llms.txt Markdown index from selected entries.
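As a rough illustration of the index layout described at the top of this page (an H1 title, a blockquote summary, then one section per top-level URL segment), the sketch below builds such a document. The IndexEntry dataclass and its fields are hypothetical stand-ins for the private _Entry type, whose fields are not documented here, and using the raw segment name as the section heading is an assumption.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical stand-in for the plugin's private _Entry type."""
    section: str
    title: str
    url: str
    excerpt: str


def render_index_sketch(*, title: str, summary: str, entries: List[IndexEntry]) -> str:
    lines = [f"# {title}", "", f"> {summary}", ""]
    # Sort before grouping so the output does not depend on input order.
    ordered = sorted(entries, key=lambda e: (e.section, e.url))
    for section, group in groupby(ordered, key=lambda e: e.section):
        lines.append(f"## {section}")
        lines.append("")
        lines.extend(f"- [{e.title}]({e.url}): {e.excerpt}" for e in group)
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"


index_text = render_index_sketch(
    title="Docs",
    summary="Summary.",
    entries=[
        IndexEntry("reference", "CLI", "https://x/reference/cli/", "Command line."),
        IndexEntry("guides", "Start", "https://x/guides/start/", "First steps."),
    ],
)
```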

render_full(entries: list[_Entry]) -> str

Render /llms-full.txt: each page's body under a title + source line.
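A minimal sketch of the concatenated layout, assuming each block is a title heading, a source line and the page body, with blocks separated by ---. The FullEntry dataclass, the "Source:" label and the exact blank-line spacing are assumptions; only the title + source line + body structure and the --- separator come from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FullEntry:
    """Hypothetical stand-in for the plugin's private _Entry type."""
    title: str
    url: str
    body: str


def render_full_sketch(entries: List[FullEntry]) -> str:
    blocks = []
    # Sort by URL so the result is independent of input order.
    for entry in sorted(entries, key=lambda e: e.url):
        blocks.append(f"# {entry.title}\n\nSource: {entry.url}\n\n{entry.body.strip()}\n")
    return "\n---\n\n".join(blocks)


pages = [
    FullEntry("B", "https://x/b/", "Beta body.\n"),
    FullEntry("A", "https://x/a/", "Alpha body."),
]
full_text = render_full_sketch(pages)
```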

build_llms(build: Build, *, include: tuple[str, ...] | None = None, exclude: tuple[str, ...] = (), full: bool = True, markdown_pages: bool = False, title: str | None = None, summary: str | None = None) -> None

Materialize the /llms.txt (and optional /llms-full.txt + per-page .md) pages.

class LlmsPlugin

Emits an llms.txt index (and optional llms-full.txt) of the site.

LlmsPlugin.apply(self, builder: Builder) -> None

llms(*, include: tuple[str, ...] | None = None, exclude: tuple[str, ...] = (), full: bool = True, markdown_pages: bool = False, title: str | None = None, summary: str | None = None) -> LlmsPlugin

Factory used in pyssg.config.py.