fix(plugin): kill roster/disposition/skill-count drift (independent review) by avrabe · Pull Request #86 · pulseengine/pulseengine.eu

avrabe · 2026-06-10T19:17:09Z

Five drift findings from an independent review — all the same root cause (a fact copied to N places, N-1 drifted), fixed by single-source + reference:

1. "four skills" injected every session: the memory-inject hook preamble, philosophy, and toolchain memory all said "four"; there are thirteen. De-enumerated all three (the skills/ dir is the source). Highest priority, since this was misinformation on a hot path.
2. Tool roster had no single source and disagreed across 5 files. The toolchain memory is now the complete roster: added gale, scry (load-bearing in proof-synthesis), thrum (real per CLAUDE.md + the thrum dashboard, not an orphan to remove), temper, mcp. report-tool-friction and the situational-awareness hook now reference it / are synced.
3. Disposition miscategorization (real routing cost): release-planning and release-artifact-pipeline were "drivers" but are judgment-heavy; routing them to a low-effort biddable model weakens triage/scoping. Reclassified.
4. "Two kinds" wasn't exhaustive (4 skills unclassified). Now three classes plus a cross-cutting set: driver {release-execution} · hybrids {release-planning, issue-hunt, release-artifact-pipeline} · explorers {6} · cross-cutting {oracle-gate, report-tool-friction, capture-session-learnings}. Every skill is in exactly one (1 + 3 + 6 + 3 = 13).
5. Verified arXiv 2605.26457 ("Verus-SpecGym"): the LLM-judge-misses-26% result is real. Citation correct, kept.

Patch → 0.8.1 (drift/consistency, not new capability).
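The "every skill in exactly one" invariant from the findings above is the kind of fact that can be checked mechanically rather than restated. A minimal sketch, assuming the skill names come from listing the skills/ directory and the classification is available as a plain mapping; `check_partition` and the sample data are hypothetical and not part of the plugin. The explorers are left out of the sample because the description only gives their count.

```python
from collections import Counter


def check_partition(all_skills, classes):
    """Check that a classification puts every skill in exactly one class.

    all_skills: iterable of skill names (assumed to come from the skills/ dir).
    classes:    dict mapping class name -> iterable of skill names.
    Returns three sorted lists: (unclassified, multiply_classified, unknown).
    """
    # Count in how many classes each name appears (set() ignores repeats
    # inside a single class).
    counts = Counter(name for members in classes.values() for name in set(members))
    known = set(all_skills)
    unclassified = sorted(known - set(counts))
    multiply_classified = sorted(name for name, n in counts.items() if n > 1)
    unknown = sorted(set(counts) - known)  # classified but not in skills/
    return unclassified, multiply_classified, unknown


if __name__ == "__main__":
    # Hypothetical sample using only the skill names the description spells out.
    sample_classes = {
        "driver": ["release-execution"],
        "hybrid": ["release-planning", "issue-hunt", "release-artifact-pipeline"],
        "cross-cutting": ["oracle-gate", "report-tool-friction",
                          "capture-session-learnings"],
    }
    sample_skills = [s for members in sample_classes.values() for s in members]
    print(check_partition(sample_skills, sample_classes))
```

Run as a pre-commit or CI step, a check of this shape would fail as soon as a new skill lands in skills/ without a disposition, instead of waiting for the next review to notice.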

🤖 Generated with Claude Code

…eview)

Five drift findings, all the same root cause (a fact copied to N places, N-1 drifted), fixed by picking the single source and making the rest reference it:

1. "four skills" injected every session: the inject-pulseengine-memory hook preamble enumerated four skills, and philosophy + toolchain memory said "four"; there are thirteen. De-enumerated all three (the skills/ dir is the source).
2. Tool roster had no single source and disagreed across 5 files. Made the toolchain memory the complete roster: added gale, scry, thrum, temper, mcp (gale + scry are load-bearing in proof-synthesis; thrum is real per CLAUDE.md + the thrum dashboard, not an orphan). report-tool-friction + the situational-awareness hook now reference the roster / are synced (+thrum).
3. Disposition miscategorization (real routing cost). release-planning and release-artifact-pipeline were "drivers" but are judgment-heavy; routing them to a low-effort biddable model weakens triage/scoping. Reclassified.
4. "Two kinds" wasn't exhaustive (4 skills unclassified). Now three classes plus a cross-cutting set: driver {release-execution}, hybrids {release-planning, issue-hunt, release-artifact-pipeline}, explorers {6}, cross-cutting {oracle-gate, report-tool-friction, capture-session-learnings}. Every skill is in exactly one.
5. Verified arXiv 2605.26457 (Verus-SpecGym; the LLM-judge-misses-26% result is real). Citation correct, kept.

Patch bump to 0.8.1 (drift/consistency, not new capability).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


* feat(plugin): harden skills from three field reports (v0.9.0)

Apply concrete fixes from three real-use field reports (v1.35→v1.55 relay, ~8-pass issue-hunt, v0.22→v0.27 gale) of the v0.8.x skills. The throughline: de-dup discipline left safety-critical gate rules reachable only in the operating-contract memory (unreachable at execution time when that file isn't loaded) and left the merge-wait unowned.

release-execution:
- Restate the universal gate rules as a self-contained, reachable checklist in the skill (the contract keeps the rationale; the skill carries the operational asserts) instead of deferring entirely to the memory.
- Add merge-wait teeth: a PR that merged in seconds didn't wait for checks; that's a red flag, not a convenience.

pulseengine-feature-loop:
- Step 8 now owns the merge-wait instead of delegating and assuming.
- Recurring N/A on steps 5/6 (witness/sigil) is a backlog item, not an exemption; escalate after three features running (this is how a real MC/DC gap and a missing attestation chain stayed hidden).
- Make the self-verify cadence concrete: before every tag, every ~2 features.

issue-hunt:
- Self-echo filter: drop items whose only post-watermark activity is the loop's own comments (filter by author + comment id), so it stops chasing its own tail.
- Structured pending_gates state (pr, watcher, action_on_green) so a PR opened-but-unlanded gets owned and merged on a later pass; resolve gates at the top of each pass.
- Re-pull from watermark −60s and dedup (exact-boundary clock skew misses).
- Verify .claude/ state file is actually git-ignored before writing; it got committed once.

operating-contract:
- Ground-claims: trust a test exit code, not a grepped "all green" tail; run one cargo at a time.
- Campaign invariants as a numbered hard checklist with cadence + the second-order gate-hardening traps (paths-ignore × required checks, strict:true merge train).
- New block: degraded infrastructure is not failure. Queued ≠ failed, never clear queued main CI as stale, diagnose before re-dispatching.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(plugin): name the reachability-redundancy exception to single-source

Review of the field-report pass surfaced that the reachability fix (restating gate rules inline in release-execution and the feature-loop land step) silently contradicts the de-dup principle #86 established, and an unstated principle is the thing that drifts next: a future drift-sweep would "correctly" re-consolidate the inline copies back into the contract, reintroducing the exact bug the field report caught.

- operating-contract: add "Single-source by default — restate inline only where absence is unsafe", making the exception explicit policy. The test is unsafe-action vs worse-output; execution-critical safety rules are intentionally redundant and labeled, everything else stays single-source; a drift-sweep must not consolidate the labeled copies.
- Label both inline copies (release-execution, feature-loop step 8) as the deliberate reachability-redundancy exception, pointing back to it.
- Generalize the exit-code-over-summary lesson: it's any grepped verifier (pytest, kani, verus, CI log), with cargo as the example, not the rule.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
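The issue-hunt changes in the feat commit above (self-echo filter, re-pull from watermark −60s, dedup) can be sketched together. This is an illustration under stated assumptions, not the skill's actual logic: `Comment`, `Item`, `fresh_items`, and the rule that a comment counts as the loop's own only when both its author and its recorded id match are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Overlap window taken from the commit message ("watermark −60s").
OVERLAP = timedelta(seconds=60)


@dataclass(frozen=True)
class Comment:
    id: int
    author: str
    created_at: datetime


@dataclass
class Item:
    number: int
    comments: list


def fresh_items(items, watermark, seen_comment_ids, loop_author, own_comment_ids):
    """Return items with new activity that is not only the loop's own comments.

    Re-pulls from watermark - OVERLAP to tolerate clock skew at the boundary,
    then dedups against seen_comment_ids (which is updated in place).
    Assumption: a comment is the loop's own if its author is loop_author AND
    its id was recorded in own_comment_ids when the loop posted it.
    """
    start = watermark - OVERLAP
    kept = []
    for item in items:
        # Activity inside the overlapped window that was not handled before.
        new = [c for c in item.comments
               if c.created_at >= start and c.id not in seen_comment_ids]
        # Self-echo filter: ignore comments the loop itself wrote.
        external = [c for c in new
                    if not (c.author == loop_author and c.id in own_comment_ids)]
        if external:
            kept.append(item)
        seen_comment_ids.update(c.id for c in new)
    return kept


if __name__ == "__main__":
    wm = datetime(2026, 6, 10, 12, 0, tzinfo=timezone.utc)
    items = [
        Item(1, [Comment(10, "loop-bot", wm + timedelta(seconds=5))]),  # self-echo
        Item(2, [Comment(11, "alice", wm - timedelta(seconds=30))]),    # skew window
    ]
    print([i.number for i in fresh_items(items, wm, set(), "loop-bot", {10})])
```

Matching on author and comment id together is a design choice of this sketch: it keeps a human who shares the loop's account from being filtered out, at the cost of requiring the loop to record the ids of the comments it posts.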

avrabe merged commit 1312c0a into main Jun 10, 2026
1 check passed

avrabe deleted the fix/plugin-roster-disposition-drift branch June 10, 2026 19:17

avrabe merged 1 commit into main from fix/plugin-roster-disposition-drift
