Skip to content

feat(mcp,skills): HTTP/OAuth transport, per-server filtering, unified skill registry, trigger auto-invoke#84

Merged
VickyXAI merged 4 commits into
mainfrom
feat/mcp-skills-refactor
Jun 14, 2026
Merged

feat(mcp,skills): HTTP/OAuth transport, per-server filtering, unified skill registry, trigger auto-invoke#84
VickyXAI merged 4 commits into
mainfrom
feat/mcp-skills-refactor

Conversation

@KillerQueen-Z

Copy link
Copy Markdown
Collaborator

Summary

Five long-standing gaps in Franklin's MCP + Skills layers, addressed in a single refactor:

Phase Item Where
P0 MCP HTTP/SSE transport + OAuth keyring-style storage src/mcp/client.ts, src/mcp/oauth.ts
P0 Unified skill Registry across bundled + learned + user + project src/skills/bootstrap.ts, src/learnings/store.ts
P1 per-server enabled_tools / disabled_tools filtering src/mcp/client.ts
P1 triggers: consumption (auto-invoke) src/skills/triggers.ts, src/agent/loop.ts
P2 /mcp status command + don't swallow stderr src/agent/commands.ts, src/mcp/client.ts

All changes are additive — existing bundled skills, the legacy learnings/store.ts API surface, and the previous stdio-only MCP config all keep working.

MCP

Transport surface

  • StreamableHTTP + SSE client transports alongside stdio. Until now transport: 'http' was declared in the type but rejected at connect time with a not yet supported log line. Hosted MCP servers (Notion, Linear, Asana, Atlassian, Zapier) are reachable now via the standard config shape — transport: 'http', url: 'https://...'.
{
  "mcpServers": {
    "notion-remote": {
      "transport": "http",
      "url": "https://mcp.notion.com/mcp",
      "oauth": true
    }
  }
}

OAuth

  • Implement the SDK's OAuthClientProvider against an on-disk store at ~/.blockrun/mcp/oauth/<server>.json (0600 file mode, 0700 directory).
  • PKCE flow uses SDK helpers; we drive the user-facing pieces:
    • Open the user's browser to the authorization URL.
    • Bind 127.0.0.1:33761 for the callback listener (one-shot).
    • Hand the code to transport.finishAuth(code).
    • Retry connect() once.
  • Tokens auto-refresh via the SDK's AuthorizationSession wrapper.

Per-server tool filtering

  • enabled_tools / disabled_tools mirror Codex's allow/deny lists (codex-rs/config/src/mcp_types.rs:71-79).
  • Applied at discovery time so the model never sees a filtered tool.
  • /mcp reports how many tools each server hid via the filter.
{
  "mcpServers": {
    "notion": {
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@notionhq/notion-mcp-server"],
      "enabled_tools": ["API-post-search", "API-retrieve-a-page"]
    }
  }
}

Diagnostics

  • Stop swallowing stdio stderr (was stderr: 'ignore'). We now 'pipe' it, tail the last 30 lines per server, and surface them under /mcp. Misconfigured servers used to look like silent connect timeouts; now the user sees the actual auth/binary/env error.
  • /mcp shows transport kind, tool count, filter count, OAuth state, and per-server failure reason + stderr tail for connections that bombed.

Skills

Unified Registry across four sources

loadAllSkills(workDir) discovers in one pass:

Source Path Precedence
project <workDir>/.franklin/skills/ and <workDir>/.skills/ 4 (highest)
user ~/.blockrun/skills/<name>/SKILL.md 3
learned ~/.blockrun/skills/learned/<name>/SKILL.md 2
bundled dist/skills-bundled/ 1

Name collisions are surfaced via registry.shadowed() so users can see what shadowed what. The old "Phase 2 of the skills MVP" comments that promised user/project discovery "in a follow-up" are now real.

Convergence with the learnings system

Before this PR, ~/.blockrun/skills/ was two parallel systems:

  1. The Anthropic SKILL.md slash-command set under src/skills/
  2. The legacy procedural-memory writer under src/learnings/store.ts — different on-disk format, different loader, injected into every system prompt at boot.

The learnings extractor now writes its output in the same SKILL.md format under ~/.blockrun/skills/learned/<name>/SKILL.md with hidden: true + auto-generated: true. These show up in the unified Registry, participate in trigger matching, but stay out of /help and franklin skills list's headline group.

formatSkillsForPrompt is no longer wired into the boot path. Per-turn trigger matching has replaced "inject top 5 by use count into every system prompt forever".

Trigger consumption (auto-invoke)

A skill's triggers: list was parsed but never consumed anywhere. Now:

  • matchSkillTriggers(input, skills) scores each skill's trigger phrases against the user message. Multi-token phrases get 3 points, single tokens 1, explicit name mention 3, prior use bonus capped at 2. Threshold 4 keeps single-word coincidences (e.g. "buy") from firing trade-signal in random chatter.
  • disableModelInvocation: true skills are excluded.
  • On match, formatSkillHints(matches) is appended to only this turn's system prompt — the skill body becomes guidance, not a destructive rewrite of the user's message. The visible transcript is unchanged.

Verified

  • ~/.blockrun/skills/test-user-skill/SKILL.mdfranklin skills list shows it with source (user).
  • <workDir>/.franklin/skills/test-project-skill/SKILL.md → visible only when CWD is that workDir.
  • ~/.blockrun/skills/learned/test-learned-skill/SKILL.md → visible with source (learned).
  • franklin start --debug -p "verify unified loader and test user skill" emits *[skill triggers] test-user-skill(6.0)* and the model acknowledges the skill hint in its response.
  • /mcp shows notion [stdio] — 22 tools for a hosted Notion MCP, codegraph [stdio] — 10 tools plus stderr tail.
  • 448/448 local tests pass.

Out of scope

  • OAuth was structurally implemented and type-checked but not end-to-end tested live (needs a real hosted MCP server to drive). The shape mirrors OAuthClientProvider semantics; should work, but Vicky/Max please sanity-check the local-callback port choice (33761) and the on-disk token location (~/.blockrun/mcp/oauth/) before users adopt remote MCP servers.
  • franklin mcp login <name> standalone command — stubbed in src/mcp/oauth.ts as loginToMcpServer, but the lazy OAuth path on franklin start is enough for first-time login.
  • /skills command rename / unified /mcp /skills /agents panel — design call.

Test plan

  • npm run build clean
  • npm test — 448/448 pass
  • User-source skill: plant + verify
  • Project-source skill: plant + verify scoped
  • Learned-source skill: plant + verify
  • Trigger match fires + hint reaches model
  • /mcp shows transport, tool count, stderr tail for real servers (Notion + CodeGraph)
  • enabled_tools / disabled_tools filter logic
  • Remote MCP server end-to-end (needs hosted MCP target — Vicky to drive)
  • OAuth callback round-trip (same)

KillerQueen-Z and others added 4 commits June 13, 2026 13:03
… skill registry, trigger auto-invoke

Five long-standing gaps in Franklin's MCP + Skills layers are addressed in a
single refactor. Everything is additive — existing bundled skills, the legacy
`learnings/store.ts` API surface, and the previous stdio-only MCP config all
keep working.

MCP — transport surface
- Add `StreamableHTTP` + `SSE` client transports alongside `stdio`. Until
  now `transport: 'http'` was declared in the type but rejected at connect
  time with a "not yet supported" log line. Hosted MCP servers (Notion,
  Linear, Asana, Atlassian, Zapier) are reachable now via the standard
  config shape — `transport: 'http'`, `url: 'https://...'`.
- Add `oauth: true | { scopes, clientName }` per-server. We implement the
  SDK's `OAuthClientProvider` against an on-disk store at
  `~/.blockrun/mcp/oauth/<server>.json` (0600 + 0700 dir mode). The PKCE
  flow uses the SDK helpers; we drive the user-facing pieces (open the
  browser, bind 127.0.0.1:33761 for the callback, hand the code back via
  `transport.finishAuth`, retry connect once).

MCP — per-server tool filtering
- `enabled_tools` / `disabled_tools` mirror Codex's allow/deny lists.
  Applied at discovery time so the model never sees a filtered tool.
  `/mcp` reports how many tools each server hid via the filter.

MCP — diagnostics
- Stop swallowing stdio stderr (was `stderr: 'ignore'`). We now `'pipe'`
  it, tail the last 30 lines per server, and surface them under `/mcp`.
  Misconfigured servers used to look like silent connect timeouts; now
  the user sees the actual auth/binary/env error.
- `/mcp` shows transport kind, tool count, filter count, OAuth state,
  and per-server failure reason + stderr tail for connections that
  bombed.

Skills — unified Registry across four sources
- `loadAllSkills(workDir)` discovers in one pass:
    bundled       — dist/skills-bundled/
    learned       — ~/.blockrun/skills/learned/<name>/SKILL.md
    user-global   — ~/.blockrun/skills/<name>/SKILL.md
    project-local — <workDir>/.franklin/skills/ and <workDir>/.skills/
  Registry precedence: project > user > learned > bundled. Name
  collisions are surfaced via `registry.shadowed()` so the user can see
  what shadowed what.
- The old "skills MVP Phase 2" comments that promised user/project
  discovery "in a follow-up" are now real.

Skills — convergence with the learnings system
- `~/.blockrun/skills/` was previously two parallel systems: the
  Anthropic SKILL.md slash-command set under `src/skills/`, and the
  legacy procedural-memory writer under `src/learnings/store.ts`. They
  used different on-disk formats, different loaders, and the latter
  was injected into every system prompt at boot.
- The learnings extractor now writes its output in the same SKILL.md
  format under `~/.blockrun/skills/learned/<name>/SKILL.md` with
  `hidden: true` + `auto-generated: true`. These show up in the
  unified Registry, participate in trigger matching, but stay out of
  `/help` and `franklin skills list`'s headline group.
- `formatSkillsForPrompt` is no longer wired into the boot path. Per-
  turn trigger matching has replaced "inject top 5 by use count into
  every system prompt forever".

Skills — trigger consumption
- `matchSkillTriggers(input, skills)` scores each skill's trigger
  phrases against the user message. Multi-token phrases get 3 points,
  single tokens 1, explicit name mention 3, prior use bonus capped at
  2. Threshold 4 keeps single-word coincidences (e.g. "buy") from
  firing trade-signal in random chatter.
- `disableModelInvocation: true` skills are excluded.
- On match, `formatSkillHints(matches)` is appended to ONLY this turn's
  system prompt — the skill body becomes guidance, not a destructive
  rewrite of the user's message. The visible transcript is unchanged.

Verified
- Plant `~/.blockrun/skills/test-user-skill/SKILL.md` →
  `franklin skills list` shows it with source `(user)`.
- Plant `<workDir>/.franklin/skills/test-project-skill/SKILL.md` →
  visible only when CWD is that workDir.
- Plant `~/.blockrun/skills/learned/test-learned-skill/SKILL.md` →
  visible in `franklin skills list` with source `(learned)`.
- `franklin start --debug -p "verify unified loader and test user skill"`
  emits `*[skill triggers] test-user-skill(6.0)*` and the model
  acknowledges the skill hint in its response.
- `/mcp` shows `notion [stdio] — 22 tools` for a hosted Notion MCP,
  `codegraph [stdio] — 10 tools` plus stderr tail.
- 448/448 local tests pass.
- skills: honor hidden flag in /help and `skills list` (was written on every
  learned skill but never read); add --all to reveal, surface hidden count,
  expose hidden/autoGenerated in --json
- learnings: migrate legacy flat ~/.blockrun/skills/<name>.md into the new
  learned/<name>/SKILL.md layout so upgrading users don't lose learned skills
- skills/triggers: frame learned/auto-generated skill bodies as UNTRUSTED in
  hint blocks so distilled session content can't inject via the system prompt
- mcp/oauth: validate the OAuth callback state against the authorization
  request (RFC 6749 CSRF defense) before exchanging the code
- mcp/client: tag MCP tool-call output as UNTRUSTED (mirrors resource path),
  relevant now that remote http/sse servers are reachable; document why the
  stderr drain supersedes the prior 'ignore' fix
- docs: correct registry precedence comment (+learned) and finishAuth attribution
- tests: cover hidden-list behavior, learned-skill untrusted framing, and the
  legacy flat-file migration (451 pass)
…fied skill registry

Sync main (3.28.5) into the branch and bump to 3.29.0. Minor bump for the
MCP + Skills feature set (HTTP/SSE transports, OAuth, per-server tool
filtering, unified four-source skill registry, trigger auto-invoke) plus the
code-review hardening fixes. package-lock realigned (was stale at 3.28.3).
@VickyXAI VickyXAI merged commit 9b33f17 into main Jun 14, 2026
2 checks passed
@VickyXAI VickyXAI deleted the feat/mcp-skills-refactor branch June 14, 2026 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants