feat(plugin): model-operating-contract for capable/agentic models; bump 0.6.0 #80
Merged
…mp 0.6.0

Rethinks how the skills run on capable, agentic models (Claude Fable 5 and successors), grounded in Anthropic's official "Prompting Claude Fable 5" guide. The skills were Opus-era-calibrated (biddable); capable models infer intent and "helpfully" deviate: they take unrequested actions, report "verified" without the tool result, and end a turn on a promise without the tool call. The fix is a model-aware contract layer, not more per-skill enumeration.

New memory/pulseengine-operating-contract.md (auto-injected once by the existing SessionStart hook), with the guide's universal blocks adapted to PulseEngine:

- Ground every progress claim in a tool result (the prompt-layer twin of oracle-gating; the fix for fabricated "verified" status).
- Never merge around a red/absent gate; report a release verified only on green CI; the turn ends at "pushed, and checks requested".
- Boundaries: assessment is a deliverable when the user is thinking out loud; check evidence before state-changing commands.
- Finish the turn you promised; minimal scope / no uninvited tidying.
- Don't transcribe reasoning into output (trips reasoning_extraction → silent Opus fallback); watch for classifier fallback on crypto/sigil work.
- Skill disposition + model routing: deterministic drivers (release-*) on a biddable model at low effort; explorers (proof-synthesis, stpa-audit, traceability-audit, feature-loop, clean-room, bootstrap) on the most capable model at higher effort.

Hardens release-execution (the driver where the merge-around-red / false-verified failure occurred) with an explicit operating-discipline block. Audited skills for the show-your-reasoning anti-pattern; none present.

Notably the guide validates two things already built: clean-room-verification IS its "fresh-context verifier subagents > self-critique", and capture-session-learnings + the memory hooks ARE its recommended memory system.

plugin + marketplace bumped to 0.6.0; fourth memory file registered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
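The gate rule in this commit ("never merge around a red/absent gate; report a release verified only on green CI; the turn ends at 'pushed, and checks requested'") can be read as a small decision table. The following is a minimal sketch of that reading, not code from the plugin: the `Gate` enum, both function names, and the "blocked" wording for a red gate are hypothetical and exist only for illustration.

```python
from enum import Enum


class Gate(Enum):
    """Hypothetical CI gate states; not an identifier from the plugin."""
    GREEN = "green"
    RED = "red"
    ABSENT = "absent"  # no CI result has been observed yet


def may_merge(gate: Gate) -> bool:
    # Only an observed green gate permits a merge; red and absent both block.
    return gate is Gate.GREEN


def release_status(gate: Gate) -> str:
    # "verified" is reserved for green CI. With no result yet, the turn
    # ends on the contract's own phrase instead of a success claim.
    if gate is Gate.GREEN:
        return "verified"
    if gate is Gate.RED:
        return "blocked: gate is red"  # illustrative wording, an assumption
    return "pushed, and checks requested"
```

The point of treating `ABSENT` as its own state is that a missing result is handled exactly like a failing one for merge purposes, which is the failure mode the commit describes.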
…t/reasoning

Incorporates external review of the operating contract:

- De-duplicate: release-execution now REFERENCES the contract for the universal gate rules (never-merge-around-red, verified-only-on-green-CI, ground-claims) instead of restating them. The contract is the single source; the skill keeps only its skill-specific discipline (fail-stops-turn, turn-ends-at-"pushed and checks requested", driver effort). Honors the guide's own update-don't-duplicate.
- Effort: drivers framed as routine work at medium effort, drop to low only if it over-deliberates (not "low is correct up front"); the real anti-drift guard is the boundary + minimal-scope blocks, which hold at any effort.
- Context-budget reassurance reframed as a harness note (hide the token countdown; reassurance is the fallback), not a standing prompt instruction.
- Reasoning-transcription rule now names the sanctioned route: route reasoning visibility through adaptive-thinking `thinking` blocks + rivet/proof artifacts; never reintroduce "explain your verification in output".
- Explorers gain the guide's self-verification interval (check your work at an interval with fresh-context subagents = clean-room-verification).

Per the de-dup principle, no parallel blocks added to release-planning / release-artifact-pipeline; the contract covers the disposition for all skills.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
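The disposition and effort framing after review (drivers on a biddable model at medium effort, dropping to low only when they over-deliberate; explorers on the most capable model at higher effort) could be sketched as a routing function. This is an assumption-laden illustration rather than the plugin's mechanism: the `Route` type, the tier and effort strings, and the `over_deliberating` flag are all hypothetical; only the skill names and the `release-*` driver pattern come from the commit messages.

```python
from dataclasses import dataclass

# Explorer skills as listed in the first commit message.
EXPLORERS = {
    "proof-synthesis", "stpa-audit", "traceability-audit",
    "feature-loop", "clean-room", "bootstrap",
}


@dataclass(frozen=True)
class Route:
    model_tier: str  # hypothetical labels, not real model identifiers
    effort: str


def route(skill: str, over_deliberating: bool = False) -> Route:
    # Deterministic drivers are the release-* skills: medium effort by
    # default, stepping down to low only if over-deliberation is observed.
    if skill.startswith("release-"):
        return Route("biddable", "low" if over_deliberating else "medium")
    if skill in EXPLORERS:
        return Route("most-capable", "high")
    raise KeyError(f"no disposition recorded for skill {skill!r}")
```

For example, `route("release-execution")` yields medium effort on the biddable tier, while `route("proof-synthesis")` yields high effort on the most capable tier. Effort is only a tuning knob here; per the review, the boundary and minimal-scope blocks are what actually guard against drift.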
…plorers

One-line pointers (not duplicated blocks) on proof-synthesis and pulseengine-feature-loop, the genuinely long-running explorers where the guide's self-verification interval is most load-bearing, referencing the operating contract + clean-room-verification (fresh-context subagents) rather than restating.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rethinks how the skills run on capable, agentic models (Claude Fable 5 and successors), grounded in Anthropic's official Prompting Claude Fable 5 guide. The skills were Opus-era-calibrated (biddable); capable models infer intent and "helpfully" deviate — unrequested actions, "verified" without the tool result, ending a turn on a promise. The fix is a model-aware contract layer, not more per-skill enumeration.
New `memory/pulseengine-operating-contract.md` (auto-injected once by the existing SessionStart hook): ground every progress claim in a tool result · never merge around a red/absent gate (verified only on green CI) · assessment-is-a-deliverable boundaries · finish the turn you promised · minimal scope · don't transcribe reasoning into output (route visibility through adaptive-thinking blocks + artifacts; otherwise it trips a silent Opus fallback) · classifier-fallback watch on crypto/sigil work · deterministic-driver vs explorer disposition + model routing (drivers: medium effort, biddable model; explorers: higher effort, most capable model, self-verify interval with fresh-context subagents).

Hardens `release-execution` (the driver where the merge-around-red / false-verified failure occurred) with a skill-specific block that references the contract for the universal rules (no duplication).

Incorporates an external review: de-duplicated the gate rules (contract is the single source), softened the effort framing (medium, drop to low if over-deliberating; the real anti-drift guard is the boundary blocks), reframed the context-budget reassurance as a harness note, named the sanctioned reasoning-visibility channel, and added the explorer self-verification interval.
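As a rough illustration of the first contract rule, "ground every progress claim in a tool result", the sketch below refuses to print "verified" unless a successful tool result is attached to the claim. The contract itself is a prompt-layer memory file, so this is only an analogy in code: `ToolResult`, `progress_line`, and the output wording are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolResult:
    """Hypothetical record of a command that was actually run."""
    command: str
    exit_code: int


def progress_line(claim: str, evidence: Optional[ToolResult]) -> str:
    # A claim with no tool result behind it is reported as unverified,
    # never silently upgraded to success.
    if evidence is None:
        return f"{claim}: not verified (no tool result)"
    if evidence.exit_code != 0:
        return f"{claim}: failed ({evidence.command} exited {evidence.exit_code})"
    return f"{claim}: verified by {evidence.command}"
```

Calling `progress_line("release build", None)` produces a "not verified" line, which is the behaviour the contract asks for in place of a fabricated "verified" status.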
Notably the guide validates two things already built: `clean-room-verification` is its "fresh-context verifier subagents > self-critique", and `capture-session-learnings` + the memory hooks are its recommended memory system. Audited skills for the show-your-reasoning anti-pattern: none present. → 0.6.0, fourth memory file.

🤖 Generated with Claude Code