feat: improve AI agent discoverability by harsh62 · Pull Request #8607 · aws-amplify/docs

harsh62 · 2026-06-29T20:32:47Z

Summary

Improves how AI agents and crawlers discover and consume the Amplify docs, implemented entirely within what the static export + Amplify Hosting can serve today. These changes came out of running the site through an "agent readiness" scan and fixing every gap that has a legitimate, non-misleading solution.

All additions point at content or capabilities that actually exist (the generated llms.txt/markdown exports, the real awslabs/agent-plugins skill, AWS's public managed MCP server, and read-only browser tools backed by real data) — no stub endpoints or fabricated capabilities.

What's included

Discoverability

- Content Signals in robots.txt (Content-Signal: search=yes, ai-input=yes, ai-train=yes).
- Link response headers (customHttp.yml, RFC 8288) advertising the API catalog, agent skills index, MCP server card, and the llms.txt index.
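As an illustration of the RFC 8288 header format mentioned above, here is a minimal sketch of how a Link header value could be assembled. The `formatLinkHeader` helper, the `/llms.txt` path, and the `describedby` relation are assumptions made for this example; the PR does not state which relation names customHttp.yml uses for each target.

```typescript
// One link target in an RFC 8288 Link header.
interface LinkTarget {
  href: string;
  rel: string;
  type?: string;
}

// Serialize targets as `<href>; rel="..."; type="..."`, comma-separated.
// Hypothetical helper, not taken from the PR.
function formatLinkHeader(links: LinkTarget[]): string {
  return links
    .map(({ href, rel, type }) => {
      const params = [`rel="${rel}"`];
      if (type) params.push(`type="${type}"`);
      return `<${href}>; ${params.join('; ')}`;
    })
    .join(', ');
}

const linkHeader = formatLinkHeader([
  // "api-catalog" is the relation registered by RFC 9727.
  {
    href: '/.well-known/api-catalog',
    rel: 'api-catalog',
    type: 'application/linkset+json'
  },
  // Assumed path and relation for the llms.txt index.
  { href: '/llms.txt', rel: 'describedby', type: 'text/plain' }
]);

console.log(linkHeader);
```

A single header carrying several comma-separated targets keeps the customHttp.yml entry to one key, at the cost of sending all targets on every matching response.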

Agent discovery files (generated at build time, alongside robots.txt/sitemap.xml)

- /.well-known/api-catalog (RFC 9727 linkset, application/linkset+json) with the required relations mapped to the artifacts the build produces: service-desc → llms-full.txt, service-doc → llms.txt, service-meta → sitemap.xml.
- /.well-known/agent-skills/index.json (Agent Skills Discovery RFC v0.2.0) advertising the real amplify-workflow skill from awslabs/agent-plugins. Entries point at the docs/marketplace install page, so no sha256 is published (the page is install guidance, not a downloadable artifact).
- /.well-known/mcp/server-card.json describing the public, no-auth AWS Knowledge MCP Server (https://knowledge-mcp.global.api.aws), which authoritatively indexes Amplify docs. An honest pointer to AWS's managed server: the card explicitly states this site does not host its own MCP server.

Markdown vending (the per-page .md files are already generated under /ai/pages/**)

- Per-page autodiscovery: each Gen2 content page's <head> emits <link rel="alternate" type="text/markdown" href="/ai/pages/….md">, reusing MarkdownMenu's getMarkdownUrl mapping and gate (skips gen1/home/overview pages that have no .md twin).
- Correct media type: /ai/**/*.md is served as text/markdown; charset=utf-8.

WebMCP (in-browser agent tools)

Registers read-only tools via document.modelContext (with navigator.modelContext fallback) so in-browser AI agents can call the site's key read actions:
- get_current_page_markdown — returns the current page's generated Markdown twin
- get_documentation_index — returns the llms.txt index
Both are backed by content the build already produces (real data, not stubs), feature-detected (silent no-op without WebMCP), and torn down on unmount via AbortSignal.

Out of scope (intentionally)

Several scanned standards require live services / DNS / viewer-request edge compute that don't exist behind a static public docs site, and publishing files for them would mislead agents that trust them:

OAuth Protected Resource (/.well-known/oauth-protected-resource) and auth.md — both declare that the site's resources are access-controlled and tell agents how to obtain tokens to reach them. docs.amplify.aws is fully public with no protected API and no agent auth. The scanner passes on the metadata file alone, but asserting a protected resource that doesn't exist could make agents refuse to read public docs or attempt pointless token flows. Deliberately skipped.
Markdown content negotiation (same-URL Accept: text/markdown) — requires reading a request header at the edge. Amplify Hosting rewrites match on path/query only (confirmed by redirects.json's own validator), and the managed CloudFront distribution exposes no viewer-request function hook to this repo. The per-page <link rel="alternate"> + text/markdown content-type above is the static-friendly equivalent (agents get the markdown at a sibling URL). True same-URL negotiation needs a CloudFront Function on a fronting distribution, owned outside this repo.
OAuth/OIDC discovery, WebMCP for write actions, DNS-AID, Web Bot Auth, commerce protocols (x402/ACP/UCP/MPP) — no protected API, no site-side mutating actions, DNS/key infrastructure owned elsewhere, and not an e-commerce site.

Testing

- New unit tests for the API catalog (incl. RFC 9727 service-desc/service-meta structure), MCP server card, agent skills index, and the WebMCP component (no-op without API, tool registration, real fetch on execute).
- Full unit suite passes (291 tests); tsc --noEmit clean on changed components.
- Verified generated output for robots.txt, api-catalog, server-card.json, and agent-skills/index.json end-to-end.

Add agent-readiness signals that the static docs build can legitimately serve:
- robots.txt: add Content-Signal directive (search/ai-input/ai-train=yes) declaring AI content-usage preferences (contentsignals.org)
- customHttp.yml: add RFC 8288 Link headers advertising the API catalog and the existing llms.txt documentation index; set linkset media type
- /.well-known/api-catalog: new RFC 9727 linkset pointing agents at the llms.txt / llms-full.txt exports and sitemap (generated at build time, mirroring how robots.txt and sitemap.xml are emitted in postBuildTasks)
- add unit tests for the API catalog generator

Add /.well-known/agent-skills/index.json (Agent Skills Discovery RFC v0.2.0) advertising the real amplify-workflow skill from awslabs/agent-plugins.
- generate-agent-skills.mjs: build-time generator emitting the index, sourcing name/description from the upstream SKILL.md frontmatter; url points at the agent-plugins docs/marketplace install page, so no sha256 digest is published (the page is discovery/install guidance, not a downloadable artifact)
- wire writeAgentSkillsIndex into postBuildTasks (mirrors robots/sitemap/catalog)
- customHttp.yml: advertise the index via an additional Link relation
- add unit tests for the index generator

Surface the real AWS Knowledge MCP server and make the generated per-page markdown twins discoverable and correctly typed.

MCP server card:
- generate-wellknown.mjs: emit /.well-known/mcp/server-card.json describing the public, no-auth AWS Knowledge MCP server (https://knowledge-mcp.global.api.aws, HTTP transport, tools) which authoritatively indexes Amplify docs. The card is an honest pointer to AWS's managed server, not a claim that docs.amplify.aws is itself an MCP endpoint.
- wire writeMcpServerCard into postBuildTasks; advertise via a Link rel

Markdown vending:
- customHttp.yml: serve /ai/**/*.md as text/markdown; charset=utf-8
- Layout: inject <link rel="alternate" type="text/markdown"> into each Gen2 content page's <head> for automatic per-page discovery, reusing MarkdownMenu's getMarkdownUrl mapping (now exported) and mirroring its gate (skip gen1/home/overview pages that have no .md twin)
- extend generate-wellknown tests for the server card

The api-catalog linkset entry was missing the required service-desc relation, so validators could not recognize a machine-readable service description. Map the relations per RFC 9727:
- service-desc: llms-full.txt (complete machine-readable export)
- service-doc: llms.txt (documentation index)
- service-meta: sitemap.xml

Each relation is now an array of { href, type } objects per Appendix A.
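The relation mapping described in this commit can be sketched as a small generator. This is an illustration only: `buildApiCatalog` is a hypothetical name (the PR's own generator is `generateApiCatalog()`), and the root-level file paths, the `anchor` value, and the media types are assumptions that the PR text does not spell out.

```typescript
// A link target object as used in linkset JSON (RFC 9264 / RFC 9727 Appendix A).
interface LinkTargetObject {
  href: string;
  type: string;
}

interface ApiCatalogEntry {
  anchor: string;
  'service-desc': LinkTargetObject[];
  'service-doc': LinkTargetObject[];
  'service-meta': LinkTargetObject[];
}

// Hypothetical generator: maps each relation to an array of { href, type }.
function buildApiCatalog(domain: string): string {
  const entry: ApiCatalogEntry = {
    anchor: `${domain}/`, // assumed anchor: the site root
    // Media types below are assumptions for the sketch.
    'service-desc': [{ href: `${domain}/llms-full.txt`, type: 'text/plain' }],
    'service-doc': [{ href: `${domain}/llms.txt`, type: 'text/plain' }],
    'service-meta': [{ href: `${domain}/sitemap.xml`, type: 'application/xml' }]
  };
  return JSON.stringify({ linkset: [entry] }, null, 2);
}

// Placeholder domain for illustration.
console.log(buildApiCatalog('https://docs.example.com'));
```

Keeping every relation as an array, even with a single target, matches the linkset shape that validators look for.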

Register WebMCP tools via document.modelContext (with navigator.modelContext fallback) so in-browser AI agents can call the docs site's key read actions:
- get_current_page_markdown: returns the current page's generated Markdown twin
- get_documentation_index: returns the llms.txt documentation index

Both tools are read-only and backed by content the build already produces, so they return real data rather than stubs. The API is feature-detected and the component renders nothing, making it a silent no-op in browsers without WebMCP. Tools are torn down on unmount via an AbortSignal. Mounted from Layout on the same Gen2 content pages that have a Markdown twin.

With trailingSlash: true, Amplify Hosting 301-redirects the extensionless path /.well-known/api-catalog to /.well-known/api-catalog/, which has no file and returns 404 -- so the RFC 9727 catalog was unreachable at its canonical path. Files with an extension are served directly with a 200.
- Write the catalog as api-catalog.json (extensioned, served as 200)
- Add a 200-rewrite in redirects.json mapping /.well-known/api-catalog to that file so the canonical path resolves in place without a redirect
- Set application/linkset+json on both paths in customHttp.yml
- Add a test asserting the 200-rewrite contract

osama-rizk

Nice, well-scoped change — solid tests and an honest scope section. No crash-level bugs; inline comments below, most-impactful first. (Checked and NOT flagging: redirects.json — the AJV validator accepts status: "200" as a string and the rule sits ahead of the /<*> catch-all, so it resolves 200 today; only nit is the test asserts existence, not ordering.)
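The ordering nit could be covered by a check along these lines. This is a sketch, not the repo's test: it assumes redirects.json is an array of rules with `source`, `target`, and `status` fields evaluated top to bottom, and the sample targets (including the catch-all's) are made up for illustration.

```typescript
// Assumed shape of one rule in redirects.json.
interface RedirectRule {
  source: string;
  target: string;
  status: string;
}

// True when the rule for `source` exists and sits ahead of the catch-all,
// so a first-match-wins evaluation reaches it.
function resolvesBeforeCatchAll(
  rules: RedirectRule[],
  source: string,
  catchAll = '/<*>'
): boolean {
  const ruleIndex = rules.findIndex((rule) => rule.source === source);
  const catchAllIndex = rules.findIndex((rule) => rule.source === catchAll);
  if (ruleIndex === -1) return false;
  return catchAllIndex === -1 || ruleIndex < catchAllIndex;
}

// Hypothetical sample data; the real file contents may differ.
const sampleRules: RedirectRule[] = [
  {
    source: '/.well-known/api-catalog',
    target: '/.well-known/api-catalog.json',
    status: '200'
  },
  { source: '/<*>', target: '/404.html', status: '404' }
];

console.log(resolvesBeforeCatchAll(sampleRules, '/.well-known/api-catalog'));
```

Asserting the index relationship, rather than mere existence, would fail the suite if someone later moved the rewrite below the catch-all.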

osama-rizk · 2026-07-02T15:56:41Z

+    await fs.writeFile(catalogPath, generateApiCatalog());
+    console.log(`api-catalog written to ${catalogPath}`);
+  } catch (error) {
+    console.error(`Error writing api-catalog to ${catalogPath}:`, error);


Swallowed write error ships a green build that advertises 404s. This catch logs and returns, so a failed write still passes. Meanwhile customHttp.yml emits a global Link header to /.well-known/api-catalog, whose links point at llms-full.txt/llms.txt. If a generator throws, agents follow a Link header to a missing file. writeSitemap/writeRobots do the same, but they aren't advertised in a response header — the blast radius is new here. Consider failing the build on write error (same applies to generate-agent-skills.mjs).
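A minimal sketch of the suggested behavior: log the failure, then rethrow so the build step rejects. The `writeOrFail` name and the injected `Writer` type are assumptions made to keep the example self-contained; they are not the PR's code.

```typescript
// Anything that writes `contents` to `path`; Node's fs.promises.writeFile
// fits this shape.
type Writer = (path: string, contents: string) => Promise<void>;

// Hypothetical wrapper: keep the log line, but propagate the error so the
// calling build task fails instead of passing with a missing file.
async function writeOrFail(
  write: Writer,
  path: string,
  contents: string
): Promise<void> {
  try {
    await write(path, contents);
    console.log(`artifact written to ${path}`);
  } catch (error) {
    console.error(`Error writing artifact to ${path}:`, error);
    throw error; // the rethrow is what turns a bad write into a failed build
  }
}
```

With the rethrow in place, a caller that awaits the writers in postBuildTasks would see a rejected promise rather than a silent success.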

osama-rizk · 2026-07-02T15:56:41Z

 }

-function getMarkdownUrl(route: string): string {
+export function getMarkdownUrl(route: string): string {


getMarkdownUrl doesn't strip the query string. usePathWithoutHash splits on # only, so /react/build-a-backend/auth/?foo=bar → /ai/pages/build-a-backend/auth/?foo=bar.md (404). This PR now routes this function into three consumers (<link rel="alternate">, the WebMcp fetch, and the copy/open menu), so one bad URL propagates everywhere.
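One way to close this gap is to cut the route at the first `?` or `#` before mapping it. The sketch below uses a hypothetical `toMarkdownUrl` with an assumed mapping (drop the leading platform segment and the trailing slash, prefix `/ai/pages/`, append `.md`); the real getMarkdownUrl mapping may differ.

```typescript
// Drop everything from the first "?" or "#" onward.
function stripQueryAndHash(route: string): string {
  const end = route.search(/[?#]/);
  return end === -1 ? route : route.slice(0, end);
}

// Hypothetical mapping, assumed for illustration:
// "/<platform>/<rest>/" -> "/ai/pages/<rest>.md"
function toMarkdownUrl(route: string): string {
  const segments = stripQueryAndHash(route).split('/').filter(Boolean);
  const withoutPlatform = segments.slice(1);
  return `/ai/pages/${withoutPlatform.join('/')}.md`;
}

console.log(toMarkdownUrl('/react/build-a-backend/auth/?foo=bar'));
console.log(toMarkdownUrl('/react/build-a-backend/auth/#sign-in'));
```

Because the stripping happens inside the mapping function, every consumer gets the cleaned URL without having to sanitize the route itself. The sketch does not handle platform-root routes, which the existing gate is described as skipping.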

osama-rizk · 2026-07-02T15:56:41Z

+      type: 'claude-skill',
+      description:
+        'Build and deploy full-stack web and mobile apps with AWS Amplify Gen2 (TypeScript code-first). Covers auth (Cognito), data (AppSync/DynamoDB), storage (S3), functions, APIs, and AI (Amplify AI Kit with Bedrock) across React, Next.js, Vue, Angular, React Native, Flutter, Swift, and Android.',
+      url: `${domain}/react/develop-with-ai/agent-plugins/`


Skill URL hardcodes /react/ for a platform-agnostic page. The discovery index is global; a docs restructure off /react/ silently publishes a 404 to agents with no error anywhere. Use a platform-neutral/canonical path.

osama-rizk · 2026-07-02T15:56:41Z

@@ -170,6 +171,14 @@ export const Layout = ({
    children?.props?.childPageNodes?.length != 'undefined' &&


isOverview guard is inert. children?.props?.childPageNodes?.length != 'undefined' compares a number to the string "undefined" — always true (meant typeof … !== 'undefined'). It works today only because the > 0 clause carries the whole predicate. Pre-existing, but this PR now depends on isOverview to gate markdownUrl, so it re-exposes it.
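The difference between the two comparisons can be shown in isolation. The function names below are hypothetical, and the parameter is typed `unknown` so that TypeScript accepts the loose comparison for demonstration purposes.

```typescript
// Buggy form: compares the value itself to the string "undefined".
// No number (and not undefined either) loosely equals that string,
// so this returns true for every realistic input.
function isDefinedBuggy(length: unknown): boolean {
  return length != 'undefined';
}

// Intended form: compares the *type name* to "undefined".
function isDefinedFixed(length: unknown): boolean {
  return typeof length !== 'undefined';
}

console.log(isDefinedBuggy(undefined), isDefinedFixed(undefined));
console.log(isDefinedBuggy(3), isDefinedFixed(3));
```

The buggy guard only goes unnoticed when a later clause, such as a `> 0` check, happens to reject the undefined case on its own.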

osama-rizk · 2026-07-02T15:56:41Z

+ * Fetch a markdown document and return its text, guarding against the SPA
+ * fallback returning an HTML page (e.g. a 404) instead of markdown.
+ */
+async function fetchMarkdown(url: string): Promise<string> {


fetchMarkdown duplicates MarkdownMenu.handleCopy. Both fetch a /ai/pages/*.md URL and reject the SPA HTML fallback with the identical regex pair (/^\s*<!doctype/i, /^\s*<html/i). Fix the fallback detection in one and the other rots. Extract a shared fetchPageMarkdown next to getMarkdownUrl — you already made that move for getMarkdownUrl.
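A shared helper along the lines suggested might look like the sketch below. The regex pair is the one quoted in the comment; the function bodies, error messages, and the injected `TextFetcher` type are assumptions for illustration rather than the code that landed.

```typescript
// The two patterns quoted in the review for detecting the SPA HTML fallback.
const HTML_FALLBACK_PATTERNS: RegExp[] = [/^\s*<!doctype/i, /^\s*<html/i];

function looksLikeHtmlFallback(body: string): boolean {
  return HTML_FALLBACK_PATTERNS.some((pattern) => pattern.test(body));
}

// Minimal fetch-like shape; the global `fetch` satisfies it.
type TextFetcher = (
  url: string
) => Promise<{ ok: boolean; text(): Promise<string> }>;

// Hypothetical shared helper for both the copy menu and the WebMCP tool.
async function fetchPageMarkdown(
  url: string,
  fetcher: TextFetcher
): Promise<string> {
  const response = await fetcher(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}`);
  }
  const body = await response.text();
  if (looksLikeHtmlFallback(body)) {
    throw new Error(`Expected markdown at ${url} but received HTML`);
  }
  return body;
}
```

With the guard in one place, a change to the fallback detection applies to every caller at once.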

osama-rizk · 2026-07-02T15:56:41Z

+
+dotenv.config({ path: './.env.custom' });
+
+const DOMAIN = process.env.SITEMAP_DOMAIN


DOMAIN + ROOT_PATH are copy-pasted across three task files (generate-sitemap, generate-wellknown, generate-agent-skills). Change the output dir or default domain and you edit three files in lockstep. Extract a shared tasks/build-constants.mjs.

osama-rizk · 2026-07-02T15:56:41Z

+
+    const register = async () => {
+      try {
+        await modelContext.registerTool(


Both registerTool calls share one try. If the first rejects (e.g. a transient duplicate-name error during the abort/re-register on fast client-side nav — both names are route-independent), the second tool never registers and the catch swallows it, leaving the page with one or zero tools. Independent trys per tool isolate them.
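The isolation suggested here can be sketched as a loop with one try per tool. `ModelContextLike` and `ToolDefinition` are simplified, hypothetical stand-ins for the WebMCP surface, not its real type definitions, and the AbortSignal teardown from the PR is left out for brevity.

```typescript
// Simplified stand-ins for the WebMCP types (assumptions for this sketch).
interface ToolDefinition {
  name: string;
  description: string;
  execute: () => Promise<string>;
}

interface ModelContextLike {
  registerTool(tool: ToolDefinition): Promise<void>;
}

// Register each tool in its own try so one rejection cannot block the rest.
// Returns the names that registered successfully.
async function registerIndependently(
  modelContext: ModelContextLike,
  tools: ToolDefinition[]
): Promise<string[]> {
  const registered: string[] = [];
  for (const tool of tools) {
    try {
      await modelContext.registerTool(tool);
      registered.push(tool.name);
    } catch (error) {
      // Log and continue; the remaining tools still get their chance.
      console.warn(`Could not register ${tool.name}:`, error);
    }
  }
  return registered;
}
```

Returning the list of registered names also gives tests something concrete to assert on when one registration is forced to fail.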

osama-rizk · 2026-07-02T15:56:41Z

+      # Link headers advertise agent-discovery resources (RFC 8288 / RFC 9727):
+      # the API catalog, the agent skills index, the MCP server card, and the
+      # LLM-friendly documentation index.
+      - key: 'Link'


The Link header sits on the global **/* block, so it rides every response (HTML, images, JSON), not just discovery routes — bytes on every request and a wider blast radius for the missing-file cases above. Worth a conscious choice vs. scoping it to the relevant paths.

- getMarkdownUrl: strip query string and hash before building the .md URL, so all three consumers (link rel=alternate, WebMcp, copy menu) get a valid URL for routes with ?query or #hash
- generators: rethrow write errors in the api-catalog, MCP server card, and agent-skills writers so a failed write fails the build instead of shipping a green build whose global Link header advertises a missing file
- WebMcp: register each tool in its own try so one rejected registration can't block the others; reuse the shared fetchPageMarkdown helper
- MarkdownMenu: extract shared fetchPageMarkdown (used by copy menu and WebMcp) so the SPA-HTML fallback guard lives in one place
- Layout: fix inert isOverview guard (typeof x !== 'undefined', not x != 'undefined')
- tasks: extract shared build-constants.mjs (DOMAIN, ROOT_PATH) and a CANONICAL_PLATFORM constant used for the agent-skills URL
- customHttp.yml: document the deliberate choice to keep the Link header on the global block (Amplify patterns are positive-match only; trailingSlash makes pages extensionless, so an html-only pattern would miss real page loads)
- tests: query/hash stripping, fetchPageMarkdown, isolated tool registration, and redirect-ordering coverage

bobbor

LGTM.

Only smaller nits that are not blocking. We can go forward with this.

mergify · 2026-07-03T09:39:13Z

Tick the box to add this pull request to the merge queue (same as @mergifyio queue).

harsh62 added 3 commits June 29, 2026 10:02

harsh62 requested a review from a team as a code owner June 29, 2026 20:32

harsh62 added 4 commits June 29, 2026 17:11

chore: update yarn.lock to v10 format for Yarn 4.17

a73e0f2

osama-rizk reviewed Jul 2, 2026

osama-rizk approved these changes Jul 3, 2026

bobbor approved these changes Jul 3, 2026

harsh62 merged commit f077b6c into main Jul 3, 2026
13 checks passed

harsh62 deleted the agent-readiness branch July 3, 2026 15:14

harsh62 merged 8 commits into main from agent-readiness
