
feat(plugin): model-operating-contract for capable/agentic models; bump 0.6.0 #80

Merged
avrabe merged 4 commits into main from feat/plugin-operating-contract
Jun 10, 2026
@avrabe avrabe commented Jun 10, 2026

Rethinks how the skills run on capable, agentic models (Claude Fable 5 and successors), grounded in Anthropic's official Prompting Claude Fable 5 guide. The skills were Opus-era-calibrated (biddable); capable models infer intent and "helpfully" deviate — unrequested actions, "verified" without the tool result, ending a turn on a promise. The fix is a model-aware contract layer, not more per-skill enumeration.

New memory/pulseengine-operating-contract.md (auto-injected once by the existing SessionStart hook) carries the universal rules:
- Ground every progress claim in a tool result.
- Never merge around a red or absent gate; report "verified" only on green CI.
- Assessment-is-a-deliverable boundaries.
- Finish the turn you promised.
- Minimal scope.
- Don't transcribe reasoning into output; route visibility through adaptive-thinking blocks and artifacts, because transcribing reasoning trips a silent Opus fallback.
- Watch for classifier fallback on crypto/sigil work.
- Deterministic-driver vs explorer disposition and model routing: drivers run at medium effort on a biddable model; explorers run at higher effort on the most capable model, with a self-verify interval using fresh-context subagents.
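The driver/explorer routing could be pictured roughly as follows. This is only an illustrative sketch, not the contract's implementation: the contract is a prompt-layer memory file, and `Route`, `route_for`, the tier labels `"biddable"` and `"most-capable"`, and the mapping of "higher effort" to `"high"` are all hypothetical. The skill names and the `release-*` driver pattern are taken from the commit messages in this PR.

```python
from dataclasses import dataclass

# Skill names come from the PR text; everything else here
# (Route, the tier labels, route_for) is hypothetical.
EXPLORERS = frozenset({
    "proof-synthesis", "stpa-audit", "traceability-audit",
    "feature-loop", "clean-room", "bootstrap",
})


@dataclass(frozen=True)
class Route:
    disposition: str   # "driver" or "explorer"
    model_tier: str    # placeholder label, not a real model identifier
    effort: str        # "low", "medium", or "high" (assumed scale)
    self_verify: bool  # check work at an interval with fresh-context subagents


def route_for(skill: str, over_deliberating: bool = False) -> Route:
    """Pick a route for a skill based on its disposition."""
    if skill.startswith("release-"):
        # Drivers: routine work at medium effort; drop to low only
        # if the model is observed to over-deliberate.
        effort = "low" if over_deliberating else "medium"
        return Route("driver", "biddable", effort, self_verify=False)
    if skill in EXPLORERS:
        # Explorers: most capable model, higher effort, periodic self-check.
        return Route("explorer", "most-capable", "high", self_verify=True)
    raise ValueError(f"no disposition recorded for skill {skill!r}")
```

Under these assumptions, `route_for("release-execution")` yields a medium-effort driver route, and an unlisted skill raises instead of silently defaulting to either disposition.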

Hardens release-execution (the driver where the merge-around-red / false-verified failure occurred) with a skill-specific block that references the contract for the universal rules (no duplication).
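As a rough illustration of the gate rule that release-execution defers to, the sketch below treats a release as mergeable or "verified" only when every required check is present and green. It is a hypothetical helper, not code from the plugin: `gate_status`, `may_report_verified`, the check names, and the result strings `"success"` and `"pending"` are assumptions made for the example.

```python
from typing import Mapping, Sequence


def gate_status(required: Sequence[str], results: Mapping[str, str]) -> str:
    """Classify the gate as 'absent', 'red', 'pending', or 'green'.

    `results` maps a check name to its reported state; the state
    strings are assumed, not taken from any specific CI system.
    """
    # A required check with no recorded result counts as an absent gate.
    if any(name not in results for name in required):
        return "absent"
    states = [results[name] for name in required]
    # Anything that is neither success nor pending is treated as red.
    if any(state not in ("success", "pending") for state in states):
        return "red"
    if any(state == "pending" for state in states):
        return "pending"
    return "green"


def may_report_verified(required: Sequence[str], results: Mapping[str, str]) -> bool:
    """Only a fully green gate justifies merging or claiming 'verified'."""
    return gate_status(required, results) == "green"
```

With a pending or absent result the honest report is the one named in the commit message, "pushed, and checks requested", rather than "verified".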

Incorporates an external review: de-duplicated the gate rules (contract is the single source), softened the effort framing (medium, drop to low if over-deliberating; the real anti-drift guard is the boundary blocks), reframed the context-budget reassurance as a harness note, named the sanctioned reasoning-visibility channel, and added the explorer self-verification interval.

Notably, the guide validates two things already built: clean-room-verification is its "fresh-context verifier subagents > self-critique", and capture-session-learnings plus the memory hooks are its recommended memory system. Audited skills for the show-your-reasoning anti-pattern: none present. Plugin and marketplace bumped to 0.6.0; fourth memory file registered.

🤖 Generated with Claude Code

avrabe and others added 4 commits June 10, 2026 20:34
…mp 0.6.0

Rethinks how the skills run on capable, agentic models (Claude Fable 5 and
successors), grounded in Anthropic's official "Prompting Claude Fable 5" guide.
The skills were Opus-era-calibrated (biddable); capable models infer intent and
"helpfully" deviate — take unrequested actions, report "verified" without the
tool result, end a turn on a promise without the tool call. The fix is a
model-aware contract layer, not more per-skill enumeration.

New memory/pulseengine-operating-contract.md (auto-injected once by the existing
SessionStart hook), with the guide's universal blocks adapted to PulseEngine:
- Ground every progress claim in a tool result (the prompt-layer twin of
  oracle-gating; the fix for fabricated "verified" status).
- Never merge around a red/absent gate; report a release verified only on green
  CI; the turn ends at "pushed, and checks requested".
- Boundaries: assessment is a deliverable when the user is thinking out loud;
  check evidence before state-changing commands.
- Finish the turn you promised; minimal scope / no uninvited tidying.
- Don't transcribe reasoning into output (trips reasoning_extraction → silent
  Opus fallback); watch for classifier fallback on crypto/sigil work.
- Skill disposition + model routing: deterministic drivers (release-*) on a
  biddable model at low effort; explorers (proof-synthesis, stpa-audit,
  traceability-audit, feature-loop, clean-room, bootstrap) on the most capable
  model at higher effort.

Hardens release-execution (the driver where the merge-around-red / false-verified
failure occurred) with an explicit operating-discipline block. Audited skills for
the show-your-reasoning anti-pattern — none present.

Notably the guide validates two things already built: clean-room-verification IS
its "fresh-context verifier subagents > self-critique", and
capture-session-learnings + the memory hooks ARE its recommended memory system.

plugin + marketplace bumped to 0.6.0; fourth memory file registered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t/reasoning

Incorporates external review of the operating contract:
- De-duplicate: release-execution now REFERENCES the contract for the universal
  gate rules (never-merge-around-red, verified-only-on-green-CI, ground-claims)
  instead of restating them — the contract is the single source; the skill keeps
  only its skill-specific discipline (fail-stops-turn, turn-ends-at-"pushed and
  checks requested", driver effort). Honors the guide's own update-don't-duplicate.
- Effort: drivers framed as routine work at medium effort, drop to low only if it
  over-deliberates (not "low is correct up front"); the real anti-drift guard is
  the boundary + minimal-scope blocks, which hold at any effort.
- Context-budget reassurance reframed as a harness note (hide the token countdown;
  reassurance is the fallback) — not a standing prompt instruction.
- Reasoning-transcription rule now names the sanctioned route: route reasoning
  visibility through adaptive-thinking `thinking` blocks + rivet/proof artifacts;
  never reintroduce "explain your verification in output".
- Explorers gain the guide's self-verification interval (check your work at an
  interval with fresh-context subagents = clean-room-verification).

Per the de-dup principle, no parallel blocks added to release-planning /
release-artifact-pipeline — the contract covers the disposition for all skills.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…plorers

One-line pointers (not duplicated blocks) on proof-synthesis and
pulseengine-feature-loop — the genuinely long-running explorers where the guide's
self-verification interval is most load-bearing — referencing the operating
contract + clean-room-verification (fresh-context subagents) rather than restating.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@avrabe avrabe merged commit 1de0886 into main Jun 10, 2026
@avrabe avrabe deleted the feat/plugin-operating-contract branch June 10, 2026 18:45