llms-txt-freshness: how should coverage handle intentionally excluded pages?

The `llms-txt-freshness` check compares `llms.txt` URLs against the sitemap to measure coverage, with thresholds at 95% (pass) and 80% (warn).
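To make the discussion concrete, here is a minimal sketch of the comparison as I understand it. This is not the library's actual code: the function name `coverage_status`, the `"fail"` label for anything below the warn threshold, and the empty-sitemap handling are all assumptions for illustration.

```python
def coverage_status(sitemap_urls, llms_txt_urls, pass_at=0.95, warn_at=0.80):
    """Return (coverage ratio, status) for llms.txt against the sitemap.

    Hypothetical helper: the thresholds mirror the ones described above,
    everything else is an illustrative assumption.
    """
    sitemap = set(sitemap_urls)
    if not sitemap:
        # Assumption: nothing to cover counts as full coverage.
        return 1.0, "pass"

    # Only sitemap pages that also appear in llms.txt count as covered.
    covered = sitemap & set(llms_txt_urls)
    ratio = len(covered) / len(sitemap)

    if ratio >= pass_at:
        status = "pass"
    elif ratio >= warn_at:
        status = "warn"
    else:
        status = "fail"  # assumed label for results below the warn threshold
    return ratio, status


# Placeholder URLs: 10 sitemap pages, 8 of them listed in llms.txt.
sitemap = [f"https://example.com/docs/page-{i}/" for i in range(10)]
llms = sitemap[:8]
ratio, status = coverage_status(sitemap, llms)
```

With 8 of 10 pages covered the ratio is 0.8, which lands in the warn band even if the two missing pages were left out on purpose.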

For our Cloudflare docs, I'm intentionally omitting certain pages that exist in our sitemap from `llms.txt` because they have no value for agents. For example, [https://developers.cloudflare.com/workers/reference/](https://developers.cloudflare.com/workers/databases/) is a directory listing page: it contains a few links to other pages but no substantive content. I believe it belongs in the sitemap (it's a real page on the site) but not in `llms.txt` (its only content is links to other pages, and those pages are already listed in `llms.txt`).

<img width="770" height="599" alt="Image" src="https://github.com/user-attachments/assets/3fcca5d7-ba70-46bc-bba7-b5352eba2105" />

In our internal audits I'm currently working around this by hardcoding lower thresholds (pass ≥ 75%, warn ≥ 60%) in a local patch, but I'm not sure what the right universal solution is. Does it make sense to treat every sitemap page as something that *should* be in `llms.txt`? If not, what's a reasonable and universal way to determine which ones should and shouldn't be included?
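One possible direction, sketched only to anchor the discussion: let the site declare the pages it excludes on purpose, and drop those from the denominator instead of lowering the thresholds. Nothing here exists in the library today as far as I know; `coverage_with_exclusions` and the `exclude_patterns` parameter are hypothetical names, and the glob matching via Python's standard `fnmatch` module is just one way to express the list.

```python
from fnmatch import fnmatchcase


def coverage_with_exclusions(sitemap_urls, llms_txt_urls, exclude_patterns=()):
    """Return (coverage ratio, excluded URLs), ignoring declared exclusions.

    Hypothetical helper: exclude_patterns is an assumed, user-supplied list of
    glob patterns for sitemap pages that are intentionally left out of llms.txt.
    """
    sitemap = set(sitemap_urls)
    excluded = {
        url for url in sitemap
        if any(fnmatchcase(url, pattern) for pattern in exclude_patterns)
    }
    expected = sitemap - excluded
    if not expected:
        # Assumption: nothing left to cover counts as full coverage.
        return 1.0, sorted(excluded)

    covered = expected & set(llms_txt_urls)
    return len(covered) / len(expected), sorted(excluded)


# Placeholder URLs: one index page plus three content pages.
sitemap = [
    "https://example.com/docs/reference/",
    "https://example.com/docs/reference/alpha/",
    "https://example.com/docs/reference/beta/",
    "https://example.com/docs/get-started/",
]
llms = [url for url in sitemap if url != "https://example.com/docs/reference/"]

plain_ratio, _ = coverage_with_exclusions(sitemap, llms)
ratio, excluded = coverage_with_exclusions(
    sitemap, llms, exclude_patterns=["https://example.com/docs/reference/"]
)
```

Without the pattern this example scores 0.75; with it, the index page is reported as excluded and the remaining pages score 1.0, so the 95% and 80% thresholds could stay as they are. The tradeoff is that the exclusion list becomes one more thing to maintain, and it doesn't answer the harder question of detecting such pages automatically.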

Open to discussion and whatever direction makes sense for the library.
