feat(plugin): model-operating-contract for capable/agentic models; bump 0.6.0 by avrabe · Pull Request #80 · pulseengine/pulseengine.eu

avrabe · 2026-06-10T18:42:06Z

Rethinks how the skills run on capable, agentic models (Claude Fable 5 and successors), grounded in Anthropic's official "Prompting Claude Fable 5" guide. The skills were Opus-era-calibrated (biddable); capable models infer intent and "helpfully" deviate: they take unrequested actions, report "verified" without the tool result, and end a turn on a promise without the tool call. The fix is a model-aware contract layer, not more per-skill enumeration.

New memory/pulseengine-operating-contract.md (auto-injected once by the existing SessionStart hook):

- ground every progress claim in a tool result
- never merge around a red/absent gate (verified only on green CI)
- assessment-is-a-deliverable boundaries
- finish the turn you promised
- minimal scope
- don't transcribe reasoning into output, which otherwise trips a silent Opus fallback; route visibility through adaptive-thinking blocks + artifacts instead
- classifier-fallback watch on crypto/sigil work
- deterministic-driver vs explorer disposition + model routing (drivers: medium effort, biddable model; explorers: higher effort, most capable model, self-verify interval with fresh-context subagents)

Hardens release-execution (the driver where the merge-around-red / false-verified failure occurred) with a skill-specific block that references the contract for the universal rules (no duplication).
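The gate rule can be pictured as a small decision function. The following is only an illustrative sketch, not code from the plugin (the contract itself is a prompt-layer memory file): the `Gate` enum, `may_merge`, and `release_status` are hypothetical names, and the returned strings simply echo the wording used in this PR.

```python
from enum import Enum
from typing import Optional


class Gate(Enum):
    """Observed CI state, taken from a tool result rather than assumed (hypothetical type)."""
    GREEN = "green"
    RED = "red"
    PENDING = "pending"


def may_merge(gate: Optional[Gate]) -> bool:
    # Never merge around a red or absent gate: only an observed green passes.
    return gate is Gate.GREEN


def release_status(gate: Optional[Gate]) -> str:
    # "verified" is reported only on green CI.
    if gate is Gate.GREEN:
        return "verified"
    # A failed gate stops the turn instead of being worked around.
    if gate is Gate.RED:
        return "failed, stop the turn"
    # Absent (None) or still-running checks: the turn ends here, with no claim of success.
    return "pushed, and checks requested"
```

Passing `None` models the absent gate: with no tool result in hand, the only honest status is "pushed, and checks requested".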

Incorporates an external review: de-duplicated the gate rules (contract is the single source), softened the effort framing (medium, drop to low if over-deliberating; the real anti-drift guard is the boundary blocks), reframed the context-budget reassurance as a harness note, named the sanctioned reasoning-visibility channel, and added the explorer self-verification interval.
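As a rough illustration of the disposition and routing rule after the review, the sketch below maps a skill name to a model class and effort level. It is an assumption-laden picture rather than the plugin's implementation: `Disposition`, `route`, and the labels `"biddable"`, `"most-capable"`, and `"high"` are placeholders, and the skill names are the ones listed in the first commit message.

```python
from dataclasses import dataclass, replace
from fnmatch import fnmatch


@dataclass(frozen=True)
class Disposition:
    kind: str    # "driver" or "explorer"
    model: str   # placeholder label, not a real model identifier
    effort: str  # placeholder effort level


# Drivers: routine work at medium effort on a biddable model.
DRIVER = Disposition(kind="driver", model="biddable", effort="medium")
# Explorers: higher effort on the most capable model ("high" is an assumed label).
EXPLORER = Disposition(kind="explorer", model="most-capable", effort="high")

DRIVER_PATTERNS = ("release-*",)
EXPLORERS = {
    "proof-synthesis", "stpa-audit", "traceability-audit",
    "feature-loop", "clean-room", "bootstrap",
}


def route(skill: str, over_deliberating: bool = False) -> Disposition:
    """Pick a disposition for a skill; unknown skills are rejected rather than guessed."""
    if any(fnmatch(skill, pattern) for pattern in DRIVER_PATTERNS):
        # Drop to low effort only when the driver is observed to over-deliberate.
        return replace(DRIVER, effort="low") if over_deliberating else DRIVER
    if skill in EXPLORERS:
        return EXPLORER
    raise ValueError(f"no disposition registered for skill: {skill}")
```

Effort is treated here as a tuning knob only; per the review, the boundary and minimal-scope blocks are what guard against drift, and they apply at any effort level.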

Notably, the guide validates two things already built: clean-room-verification is its "fresh-context verifier subagents > self-critique", and capture-session-learnings + the memory hooks are its recommended memory system. Audited skills for the show-your-reasoning anti-pattern: none present. Plugin + marketplace bumped to 0.6.0; fourth memory file registered.

🤖 Generated with Claude Code

…mp 0.6.0

Rethinks how the skills run on capable, agentic models (Claude Fable 5 and successors), grounded in Anthropic's official "Prompting Claude Fable 5" guide. The skills were Opus-era-calibrated (biddable); capable models infer intent and "helpfully" deviate: take unrequested actions, report "verified" without the tool result, end a turn on a promise without the tool call. The fix is a model-aware contract layer, not more per-skill enumeration.

New memory/pulseengine-operating-contract.md (auto-injected once by the existing SessionStart hook), with the guide's universal blocks adapted to PulseEngine:

- Ground every progress claim in a tool result (the prompt-layer twin of oracle-gating; the fix for fabricated "verified" status).
- Never merge around a red/absent gate; report a release verified only on green CI; the turn ends at "pushed, and checks requested".
- Boundaries: assessment is a deliverable when the user is thinking out loud; check evidence before state-changing commands.
- Finish the turn you promised; minimal scope / no uninvited tidying.
- Don't transcribe reasoning into output (trips reasoning_extraction → silent Opus fallback); watch for classifier fallback on crypto/sigil work.
- Skill disposition + model routing: deterministic drivers (release-*) on a biddable model at low effort; explorers (proof-synthesis, stpa-audit, traceability-audit, feature-loop, clean-room, bootstrap) on the most capable model at higher effort.

Hardens release-execution (the driver where the merge-around-red / false-verified failure occurred) with an explicit operating-discipline block. Audited skills for the show-your-reasoning anti-pattern: none present.

Notably the guide validates two things already built: clean-room-verification IS its "fresh-context verifier subagents > self-critique", and capture-session-learnings + the memory hooks ARE its recommended memory system.

plugin + marketplace bumped to 0.6.0; fourth memory file registered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…t/reasoning

Incorporates external review of the operating contract:

- De-duplicate: release-execution now REFERENCES the contract for the universal gate rules (never-merge-around-red, verified-only-on-green-CI, ground-claims) instead of restating them. The contract is the single source; the skill keeps only its skill-specific discipline (fail-stops-turn, turn-ends-at-"pushed and checks requested", driver effort). Honors the guide's own update-don't-duplicate.
- Effort: drivers framed as routine work at medium effort, drop to low only if it over-deliberates (not "low is correct up front"); the real anti-drift guard is the boundary + minimal-scope blocks, which hold at any effort.
- Context-budget reassurance reframed as a harness note (hide the token countdown; reassurance is the fallback), not a standing prompt instruction.
- Reasoning-transcription rule now names the sanctioned route: route reasoning visibility through adaptive-thinking `thinking` blocks + rivet/proof artifacts; never reintroduce "explain your verification in output".
- Explorers gain the guide's self-verification interval (check your work at an interval with fresh-context subagents = clean-room-verification).

Per the de-dup principle, no parallel blocks added to release-planning / release-artifact-pipeline; the contract covers the disposition for all skills.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…plorers

One-line pointers (not duplicated blocks) on proof-synthesis and pulseengine-feature-loop, the genuinely long-running explorers where the guide's self-verification interval is most load-bearing, referencing the operating contract + clean-room-verification (fresh-context subagents) rather than restating.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

avrabe and others added 4 commits June 10, 2026 20:34

docs(plugin): align README effort phrasing with the softened contract

3e9d38d

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

avrabe merged commit 1de0886 into main Jun 10, 2026

avrabe deleted the feat/plugin-operating-contract branch June 10, 2026 18:45

avrabe merged 4 commits into main from feat/plugin-operating-contract
