FE-730: Orchestrator POC — dual-engine execution with contract tests#143
kostandinang wants to merge 21 commits into
PR Summary: Low Risk
Overview: Updates build/runtime hygiene: gitignores Refreshes internal planning docs (
Reviewed by Cursor Bugbot for commit 934ea57.
🤖 Augment PR Summary
Summary: Adds an Orchestrator POC behind
Changes:
Technical Notes: Deterministic verification uses orchestrator-owned
f41c9d5 to d18a745
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit dee641a. Configure here.
import type { Orchestrator, OrchestratorInput, OrchestratorResult } from './types.js';

// ---------------------------------------------------------------------------
// ProceduralOrchestrator — same compiled net, serial firing policy.
In the currently landed shape, the “procedural” and “petri” engines share the same compiled net and the same serial interpreter. I assume this is a temporary Phase 0 arrangement?
  id: `${sid}:evaluate`,
  inputs: [p(sid, 'spec-ready'), p(sid, 'test-agent')],
  fire: async (consumed) => {
    const reportId = await actions['evaluate-done'](actCtx);
I think you've got a good split between control state in places/tokens and substantive handoff state in reports, but we should keep it as an open question whether this balance is ultimately right, or whether more metadata or state should eventually be pulled into tokens.
export type TransitionDef = {
  id: string;
  inputs: string[];
  fire: (consumed: Token[]) => Promise<{ place: string; token: Token }[]>;
I see transitions currently declare only their input places structurally; output arcs live inside the imperative fire() closure. That gives the runtime a Petri-ish control shape, but it also means the compiled net is only partially declarative, so it cannot be formally analyzed from its structure alone.
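To make the point above concrete, here is a minimal sketch of a transition that also declares its output places, so the arcs can be listed without running any handler. It is not the PR's code: `DeclaredTransition`, `outputs`, `checkedFire`, and `arcsOf` are hypothetical names, the `Token` shape is assumed, and `fire` is synchronous here for brevity (the PR's `fire` returns a Promise).

```typescript
// Hypothetical sketch: the Token shape is assumed, not taken from the PR.
type Token = { reportId?: string };
type Produced = { place: string; token: Token };

// Like the PR's TransitionDef, plus a declared `outputs` set (an assumption).
type DeclaredTransition = {
  id: string;
  inputs: string[];
  outputs: string[];
  fire: (consumed: Token[]) => Produced[];
};

// Fire a transition and reject any token aimed at an undeclared place,
// so the declared arcs stay a superset of the runtime behaviour.
function checkedFire(t: DeclaredTransition, consumed: Token[]): Produced[] {
  const allowed = new Set(t.outputs);
  const produced = t.fire(consumed);
  for (const { place } of produced) {
    if (!allowed.has(place)) {
      throw new Error(`${t.id} produced into undeclared place ${place}`);
    }
  }
  return produced;
}

// Static arcs can be listed without executing any handler.
function arcsOf(t: DeclaredTransition): string[] {
  return [
    ...t.inputs.map((p) => `${p} -> ${t.id}`),
    ...t.outputs.map((p) => `${t.id} -> ${p}`),
  ];
}

// Illustrative transition; place names are made up for the example.
const evaluate: DeclaredTransition = {
  id: 's1:evaluate',
  inputs: ['s1:spec-ready'],
  outputs: ['s1:done', 's1:needs-more'],
  fire: () => [{ place: 's1:done', token: {} }],
};
```

With `outputs` declared, the conditional choice between `done` and `needs-more` is still made inside `fire`, but the set of reachable places is visible to any tool that reads the net.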
 * Create an isolated run directory under `baseDir/.cook/runs/<runId>/`.
 * `baseDir` should be cwd (not the fixture directory) so fixtures stay pristine.
 */
export function createWorktree(baseDir: string, runId?: string): WorktreeInfo {
In the implemented path, “worktree” currently means an isolated run directory rather than a Git worktree. It might be good to disambiguate this.
lunelson left a comment
I think this is great as a POC. I just want to make clear that it diverges from true Petri-net properties, because output places are determined dynamically by the transition. This might need some remodelling, otherwise we preclude some of the theoretical benefits.
Thanks, this is a fair point. This PR reflects a first take at the PoC, focusing on the overall architecture rather than going deep on the Petri interaction, so topology-level analysis is not yet possible. FE-738 (petri-semantic-lanes) moved this partway by separating topology compilation from runtime wiring and adding declared output sets to handlers. The remaining work, splitting conditional handlers into explicit graph transitions with declared outputs/guards, is in progress and will land in subsequent PRs, which should address this. Appreciate the early catch here.
- Add orchestrator capability requirements (R46–R50) to SPEC.md
- Add decisions D155-K–D159-K (dual-engine, reports.jsonl, ActionRegistry, plan model, worktree isolation)
- Add invariants I121-K–I123-K (contract test parity, token discipline, worktree safety)
- Add orchestrator lexicon entries
- Add orchestrator-poc frontier definition to PLAN.md
- Move design doc to docs/design/orchestrator.md
- Update Linear FE-730 description to match design doc
Co-authored-by: Amp <amp@ampcode.com>
#1
- types.ts: Plan, Epic, Slice, Orchestrator seam, ReportSink, ActionHandlers
- report-sink.ts: InMemoryReportSink (append + query by id)
- engine-proc.ts: ProceduralOrchestrator with TDD inner loop, topo-sort, epic-level verification, retry loop
- engine-contract.test.ts: 4 tests — status completed, correct outcomes, TDD cycle call order, report sink contents
- Code lives under src/orchestrator/ (cook is CLI subcommand name only)
- All 4 contract tests pass; npm run verify clean
Co-authored-by: Amp <amp@ampcode.com>
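The TDD inner loop named in this commit can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in engine-proc.ts: `SliceActions` and `runSlice` are hypothetical names, the actions are synchronous for brevity, and the retry budget is charged on failing test runs only (how the PR charges it is not shown here).

```typescript
// Hypothetical action seam for one slice; the PR's ActionHandlers may differ.
type SliceActions = {
  evaluate: () => boolean; // true when the evaluator judges the slice done
  writeTests: () => void;
  writeCode: () => void;
  runTests: () => boolean; // true when the tests pass
};

// evaluate -> write-tests -> write-code -> run-tests, looping back to
// write-code on a failing run and back to evaluate on a passing run.
function runSlice(actions: SliceActions, maxRetries: number): 'completed' | 'halted' {
  let retries = 0;
  while (!actions.evaluate()) {
    actions.writeTests();
    actions.writeCode();
    while (!actions.runTests()) {
      // Assumption: each failing run consumes one unit of retry budget.
      if (retries >= maxRetries) return 'halted';
      retries += 1;
      actions.writeCode();
    }
  }
  return 'completed';
}

// Fake actions in the spirit of the contract tests: record the call order.
function makeFakes(evalResults: boolean[], testResults: boolean[]) {
  const calls: string[] = [];
  const actions: SliceActions = {
    evaluate: () => { calls.push('evaluate'); return evalResults.shift() ?? true; },
    writeTests: () => { calls.push('write-tests'); },
    writeCode: () => { calls.push('write-code'); },
    runTests: () => { calls.push('run-tests'); return testResults.shift() ?? true; },
  };
  return { actions, calls };
}
```

Note that the outer loop in this sketch is unbounded if the evaluator never reports done; a real engine would need some budget on that path too.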
… CLI, fixture
- plan-loader.ts: YAML parsing with validation (3 tests)
- test-runner.ts: BunTestRunner wrapping `bun test`
- worktree.ts: createWorktree with .cook/runs/<runId>/worktree/ (2 tests)
- file-report-sink.ts: JSONL-backed ReportSink with stdout streaming
- pi-actions.ts: createPiActions() dispatching pi CLI for each agent role
- prompts/: test-writer.md, code-writer.md, evaluator.md
- cook-cli.ts: parseCookArgs + runCook wiring everything together (5 tests)
- cli.ts: `brunch cook` command registered alongside agent
- fixtures/txt/plan.yaml: Fixture #1 (2 epics, 5 slices)
- 34 orchestrator tests pass; build clean with cook-cli chunk
Design doc §8: worktree at <cwd>/.cook/runs/ not <dir>/.cook/runs/
R49, D159-K, I123-K: updated to cwd-scoped worktree
Lexicon: worktree entry clarifies cwd-scoped
Card 16 scoped: cwd worktree + fixture cleanup
Co-authored-by: Amp <amp@ampcode.com>
- cook-cli: structured header/footer with engine, plan, worktree, retries; per-epic/slice result table; total duration
- pi-actions: elapsed timer from session start, compact one-line-per-action with icons (▸ start, ✓ done, ✗ fail, ● verdict, ○ needs work, ? evaluate)
- file-report-sink: stop streaming raw JSON to stdout; JSON stays in file only
- 35 tests pass, build clean
The field was always the agent working directory, not the fixture directory. Also removes unused ReportLine import from engine-proc.
Co-authored-by: Amp <amp@ampcode.com>
topoSort<T>(items, getId, getDeps) replaces topoSort(epics) + topoSortSlices(slices).
Co-authored-by: Amp <amp@ampcode.com>
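For readers unfamiliar with the shape of such a helper, a generic topological sort with that signature could look like the sketch below. Only the signature comes from the commit message; the depth-first body, the error messages, and the `Slice` example type are assumptions.

```typescript
// Generic depth-first topological sort. The signature follows the commit
// message; the body is an assumption, not the PR's implementation.
function topoSort<T>(
  items: T[],
  getId: (item: T) => string,
  getDeps: (item: T) => string[],
): T[] {
  const byId = new Map<string, T>();
  for (const item of items) byId.set(getId(item), item);

  const sorted: T[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (id: string): void => {
    if (state.get(id) === 'done') return;
    if (state.get(id) === 'visiting') throw new Error(`dependency cycle at ${id}`);
    const item = byId.get(id);
    if (item === undefined) throw new Error(`unknown dependency ${id}`);
    state.set(id, 'visiting');
    for (const dep of getDeps(item)) visit(dep); // dependencies first
    state.set(id, 'done');
    sorted.push(item);
  };

  for (const item of items) visit(getId(item));
  return sorted;
}

// Hypothetical slice shape, using slice ids from the txt fixture.
type Slice = { id: string; dependsOn: string[] };
const slices: Slice[] = [
  { id: 'help-flag', dependsOn: ['version-flag'] },
  { id: 'version-flag', dependsOn: [] },
];
const order = topoSort(slices, (s) => s.id, (s) => s.dependsOn).map((s) => s.id);
// order is ['version-flag', 'help-flag']
```

The same function can sort epics by passing different accessors, which is the point of collapsing the two earlier helpers into one.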
report-helpers.ts: createReport(sink, fields) handles id generation + timestamp + append. Replaces 5 inline report-construction sites across engine-proc, engine-petri, and pi-actions.
Co-authored-by: Amp <amp@ampcode.com>
Delete old module-level callOrder/evalCallCount/fakeActions/fakeTestRunner. All 9 contract test suites now use the same createFakes() factory. ~100 lines removed.
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Amp <amp@ampcode.com>
…sults
- Status banner: landed POC with SPEC cross-references
- §2 seam: fixtureDir → worktreeDir, ActionRegistry → ActionHandlers
- §3: POC note pointing to §12 deferral
- §12: streaming UX row updated (implemented, not deferred)
- §13: experiment results with verdict (proc wins on simplicity)
Co-authored-by: Amp <amp@ampcode.com>
…propagation, verify-epic parity
- cook-cli: validate --max-retries is finite non-negative (prevents NaN infinite loop)
- engine-petri: epic deps use single transition with ALL dep-done places as inputs (was one transition per dep → fired on first dep instead of all)
- engine-petri: PetriNet.run() accepts shouldHalt callback, checked each iteration (was ignoring ctx.halted so transitions kept firing after a halt)
- engine-proc: verify-epic called once per epic, not once per verification entry (handler owns all targets; matches petri engine behavior)
Co-authored-by: Amp <amp@ampcode.com>
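The `--max-retries` fix matters because every comparison against `NaN` is false, so a guard such as `attempts >= maxRetries` would never trip. A minimal validation sketch follows; `parseMaxRetries` is a hypothetical helper name, and the integer and empty-string checks are additions beyond the "finite non-negative" rule stated in the commit.

```typescript
// Hypothetical helper: parse a --max-retries value, rejecting NaN, Infinity,
// and negatives. The integer and empty-string checks are assumptions added
// for the sketch (Number('') would otherwise silently become 0).
function parseMaxRetries(raw: string): number {
  const n = Number(raw);
  if (raw.trim() === '' || !Number.isFinite(n) || n < 0 || !Number.isInteger(n)) {
    throw new Error(`--max-retries must be a non-negative integer, got "${raw}"`);
  }
  return n;
}
```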
- engine-petri: unreached slices/epics now set ctx.halted=true so the overall status correctly reports 'halted' instead of 'completed'
- report-helpers: append monotonic sequence counter to IDs to prevent collisions when multiple reports are created in the same millisecond
Co-authored-by: Amp <amp@ampcode.com>
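A sketch of how a report helper with a monotonic counter might look is below. The id layout (`rpt-<actor>-<sliceId>-<epoch ms>-<seq>`) is inferred from the sample reports.jsonl in this PR; the `Report` and `ReportSink` shapes, the synchronous `append`, and the injectable `now` parameter are assumptions made so the example can run on its own.

```typescript
// Assumed report shape, mirroring the fields visible in reports.jsonl.
type Report = {
  id: string;
  ts: string;
  epicId: string;
  sliceId: string;
  actor: string;
  event: string;
  payload: unknown;
};

// Assumed sink seam; the PR's ReportSink may be asynchronous.
interface ReportSink {
  append(report: Report): void;
}

// Process-wide counter, never reset, so ids stay unique even when two
// reports share the same millisecond timestamp.
let seq = 0;

function createReport(
  sink: ReportSink,
  fields: Omit<Report, 'id' | 'ts'>,
  now: () => number = Date.now, // injectable clock (an assumption, for testing)
): Report {
  const ms = now();
  const report: Report = {
    ...fields,
    id: `rpt-${fields.actor}-${fields.sliceId}-${ms}-${seq++}`,
    ts: new Date(ms).toISOString(),
  };
  sink.append(report);
  return report;
}
```

Calling `createReport` twice with the same clock value yields ids that differ only in the trailing counter, which is the collision case the commit describes.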
- engine-petri: epic deps use per-dependent signal places (same pattern as slice deps) so multiple epics depending on the same predecessor each get their own token instead of competing for one
- engine-proc: haltedResult() fills in unreached epics/slices as halted before returning, matching petri engine behavior
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Amp <amp@ampcode.com>
Co-authored-by: Amp <amp@ampcode.com>
- engine-proc: hoist reportIds to run() scope so catch preserves them
- engine-petri: remove dead ep(epicId, 'ready') place (readiness fans out directly to slice eligible places)
- pi-actions: verify-epic write step uses test-writer.md, not evaluator.md
Co-authored-by: Amp <amp@ampcode.com>
- Extract PetriNet class, Token, TransitionDef, FiringPolicy into petri-net.ts
- Extract compilePlan() and RunCtx into net-compiler.ts
- Both engines now call shared compilePlan() with serial firing policy
- Migrate retry state from ctx.retries Map into in-net retry-budget places
- Add adapter tests pinning compiled net place/transition counts
- engine-petri.ts and engine-proc.ts are thin wrappers (~65 LOC each)
Amp-Thread-ID: https://ampcode.com/threads/T-019e4b4e-1543-7602-b99d-c32342fb3938
Co-authored-by: Amp <amp@ampcode.com>
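As a rough picture of what a shared net with a serial firing policy involves, the sketch below implements a tiny place/transition net with a `shouldHalt` check per iteration. It is an illustration under assumptions, not the extracted petri-net.ts: the `Token` shape is made up, `fire` is synchronous, and "serial" is taken to mean "fire the first enabled transition, one at a time".

```typescript
type Token = { reportId?: string };
type Produced = { place: string; token: Token };
type TransitionDef = {
  id: string;
  inputs: string[];
  fire: (consumed: Token[]) => Produced[];
};

class PetriNet {
  private marking = new Map<string, Token[]>();
  private transitions: TransitionDef[];

  constructor(transitions: TransitionDef[]) {
    this.transitions = transitions;
  }

  put(place: string, token: Token): void {
    const tokens = this.marking.get(place) ?? [];
    tokens.push(token);
    this.marking.set(place, tokens);
  }

  count(place: string): number {
    return this.marking.get(place)?.length ?? 0;
  }

  // A transition is enabled only when every input place holds a token.
  private enabled(t: TransitionDef): boolean {
    return t.inputs.every((p) => this.count(p) > 0);
  }

  // Serial policy (assumed): fire the first enabled transition, one per
  // iteration, and check shouldHalt before each firing.
  run(shouldHalt: () => boolean = () => false): string[] {
    const fired: string[] = [];
    while (!shouldHalt()) {
      const t = this.transitions.find((c) => this.enabled(c));
      if (t === undefined) break;
      const consumed = t.inputs.map((p) => this.marking.get(p)!.shift()!);
      for (const out of t.fire(consumed)) this.put(out.place, out.token);
      fired.push(t.id);
    }
    return fired;
  }
}

// A join: one transition with ALL dependency places as inputs, so it fires
// only after both predecessors have completed. Place names are illustrative.
function makeJoinNet(): PetriNet {
  return new PetriNet([
    {
      id: 'c:start',
      inputs: ['a:done', 'b:done'],
      fire: () => [{ place: 'c:eligible', token: {} }],
    },
  ]);
}
```

The join transition shows why the earlier one-transition-per-dependency shape fired too early: each of those transitions was enabled by a single token.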
…oped
- Mark orchestrator-poc done in PLAN.md (Phase 0 complete)
- Add petri-semantic-lanes and petri-parallel-execution frontier definitions
- Add petri-graph-compilation and petri-simulation-oracle to Horizon
- Add Track F dependency graph for H-6476 umbrella
- Scope Card 1-3 queue in CARDS.md for petri-semantic-lanes
Amp-Thread-ID: https://ampcode.com/threads/T-019e4b4e-1543-7602-b99d-c32342fb3938
Co-authored-by: Amp <amp@ampcode.com>
b72d1ad to 934ea57


Orchestrator POC. First take at representing orchestration on a minimal Petri-net structure — the goal is to validate the substrate on a simple fixture and evolve from there with more complex plans, richer action types, and parallel execution. Two interchangeable engines behind a shared seam, driven test-first with fake agents and validated end-to-end with real pi/Sonnet. The plan schema is speculative — brunch does not yet emit execution plans; the YAML shape is forward-compatible and will sharpen as canonical plan output lands.
CLI
Architecture
graph TD
  CLI["brunch cook <dir>"] --> Loader["Plan Loader"]
  Loader --> Engine
  subgraph Engine["Orchestrator Interface"]
    Proc["proc engine"]
    Petri["petri engine"]
  end
  Engine --> Actions["Action Handlers"]
  Engine --> Runner["Test Runner"]
  Actions --> Pi["pi CLI"]
  Engine --> Reports["reports.jsonl"]
  Engine --> WT["Worktree"]
Per-slice TDD loop
graph TD
  subgraph "Petri net inner loop"
    Ready(("spec\nready")) -->|evaluate| NM(("needs\nmore"))
    Ready -->|evaluate| Done(("done"))
    NM -->|write-tests| FT(("failing\ntests"))
    FT -->|write-code| UC(("untested\ncode"))
    UC -->|"run-tests ✓"| Ready
    UC -->|"run-tests ✗"| FT
    Done --> Comp(("completed"))
  end
Key decisions
- `ActionRegistry` deferred until a 3rd action type lands
- `reports.jsonl` as communication medium: petri enforces token-pointer discipline; proc passes data normally; shared seam is inputs/outputs
- `<cwd>/.cook/runs/<runId>/`, fixtures stay pristine
- `--mode text`; `extractJson()` parses evaluator responses
Verification
- `txt` CLI from empty worktree
Proc vs Petri
- `Promise.all`: structural change
Verdict: Proc wins on simplicity and debuggability. Petri earns its complexity only when parallel execution or dynamic replanning enters scope. More testing with the Petri net is needed to understand how it behaves under parallel execution.
Out of scope
Milestones, resumability, parallel execution, brownfield seed, `ActionRegistry` abstraction, dynamic replanning.
Fixture: `txt`
Greenfield TypeScript CLI built from nothing, with 2 epics and 5 slices:
- `--version`, `--help` (lists subcommands)
- `reverse`, `count`, `slugify`
Exercises: happy-path TDD cycles, intra-epic slice deps (`help-flag` waits on `version-flag`), inter-epic deps (`text-ops` waits on `scaffolding`), epic-level integration verification, and the retry loop (slugify edge cases).
Running examples
TBD — sample CLI output and fixture run recordings to be added.
Reports.jsonl
{"id":"rpt-evaluator-version-flag-1779363759893-0","ts":"2026-05-21T11:42:39.893Z","epicId":"scaffolding","sliceId":"version-flag","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target `tests/version.test.ts` does not exist in the worktree. The worktree directory is empty (only contains `.` and `..`). No test file has been created to verify the `--version` flag implementation. To satisfy this slice, a test file must be created at `tests/version.test.ts` that defines and tests the functionality of adding a `--version` flag that prints the version from package.json, and those tests must pass when run with `bun test`."}}
{"id":"rpt-test-writer-version-flag-1779363997897-1","ts":"2026-05-21T11:46:37.898Z","epicId":"scaffolding","sliceId":"version-flag","actor":"test-writer","event":"tests-written","payload":{"sliceId":"version-flag","targets":["tests/version.test.ts"]}}
{"id":"rpt-code-writer-version-flag-1779364039448-2","ts":"2026-05-21T11:47:19.449Z","epicId":"scaffolding","sliceId":"version-flag","actor":"code-writer","event":"code-written","payload":{"sliceId":"version-flag"}}
{"id":"rpt-test-runner-version-flag-1779364040678-3","ts":"2026-05-21T11:47:20.678Z","epicId":"scaffolding","sliceId":"version-flag","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-version-flag-1779364062168-4","ts":"2026-05-21T11:47:42.168Z","epicId":"scaffolding","sliceId":"version-flag","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 7 tests in tests/version.test.ts pass successfully. The implementation correctly handles the --version flag specification:\n\n✅ Exits with code 0 when --version is passed\n✅ Prints the version from package.json to stdout\n✅ Does not write to stderr\n✅ Does not launch web UI or show help banners\n✅ Supports -V short flag variant\n✅ Version output is semver-formatted\n✅ Version output is single line\n\nThe implementation in src/server/cli.ts correctly reads from package.json and outputs the version before exiting. All verification targets are satisfied."}}
{"id":"rpt-evaluator-help-flag-1779364081347-5","ts":"2026-05-21T11:48:01.348Z","epicId":"scaffolding","sliceId":"help-flag","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"Verification target `tests/help.test.ts` does not exist. The test file that should validate the --help flag functionality (listing subcommands: reverse, count, slugify) is missing entirely. No tests can be run to verify the implementation. The slice specification requires both the implementation of the --help flag and the corresponding test coverage."}}
{"id":"rpt-test-writer-help-flag-1779364118018-6","ts":"2026-05-21T11:48:38.018Z","epicId":"scaffolding","sliceId":"help-flag","actor":"test-writer","event":"tests-written","payload":{"sliceId":"help-flag","targets":["tests/help.test.ts"]}}
{"id":"rpt-code-writer-help-flag-1779364162780-7","ts":"2026-05-21T11:49:22.781Z","epicId":"scaffolding","sliceId":"help-flag","actor":"code-writer","event":"code-written","payload":{"sliceId":"help-flag"}}
{"id":"rpt-test-runner-help-flag-1779364164386-8","ts":"2026-05-21T11:49:24.386Z","epicId":"scaffolding","sliceId":"help-flag","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-help-flag-1779364183312-9","ts":"2026-05-21T11:49:43.312Z","epicId":"scaffolding","sliceId":"help-flag","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 10 tests in tests/help.test.ts pass. The verification targets comprehensively cover the slice specification: --help flag exits with code 0, prints help text to stdout, lists all three subcommands (reverse, count, slugify), supports -h short flag, does not launch the web UI, and includes a usage/commands header. No test failures or missing tests."}}
{"id":"rpt-orchestrator-version-flag-1779364276953-10","ts":"2026-05-21T11:51:16.953Z","epicId":"scaffolding","sliceId":"version-flag","actor":"orchestrator","event":"epic-verified","payload":{"passed":true}}
{"id":"rpt-evaluator-reverse-1779364313167-11","ts":"2026-05-21T11:51:53.167Z","epicId":"text-ops","sliceId":"reverse","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target file 'tests/reverse.test.ts' does not exist in the worktree. Additionally, the reverse subcommand is listed in the help output but has no actual implementation in src/server/cli.ts — there is no handler for rawArgs[0] === 'reverse', no pure string reversal function, and no wiring to argv[2]. The specification requires tests to exist and pass, but the test file is completely missing, making it impossible to verify the slice is satisfied."}}
{"id":"rpt-test-writer-reverse-1779364369807-12","ts":"2026-05-21T11:52:49.807Z","epicId":"text-ops","sliceId":"reverse","actor":"test-writer","event":"tests-written","payload":{"sliceId":"reverse","targets":["tests/reverse.test.ts"]}}
{"id":"rpt-code-writer-reverse-1779364421099-13","ts":"2026-05-21T11:53:41.099Z","epicId":"text-ops","sliceId":"reverse","actor":"code-writer","event":"code-written","payload":{"sliceId":"reverse"}}
{"id":"rpt-test-runner-reverse-1779364422488-14","ts":"2026-05-21T11:53:42.488Z","epicId":"text-ops","sliceId":"reverse","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-reverse-1779364440832-15","ts":"2026-05-21T11:54:00.832Z","epicId":"text-ops","sliceId":"reverse","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 22 tests in tests/reverse.test.ts pass successfully. The test suite verifies: (1) the pure `reverse()` function is exported and correctly reverses strings of all types (ASCII, unicode-compatible, with spaces/numbers/punctuation, palindromes, empty strings, single chars), (2) the CLI `reverse` subcommand exits with code 0, (3) the subcommand reads from argv[2] and outputs the reversed string to stdout, (4) output is newline-terminated with no extra lines, and (5) stderr is empty on normal invocation. All specification requirements are satisfied."}}
{"id":"rpt-evaluator-count-1779364466667-16","ts":"2026-05-21T11:54:26.668Z","epicId":"text-ops","sliceId":"count","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target file `tests/count.test.ts` does not exist. The slice specification requires a `count` subcommand that counts whitespace-separated words with empty input returning 0, but there is no test file to verify this functionality. The worktree contains only: cli-scaffolding.integration.test.ts, help.test.ts, reverse.test.ts, and version.test.ts. Neither the test file nor any implementation of the count function exists."}}
{"id":"rpt-test-writer-count-1779364575496-17","ts":"2026-05-21T11:56:15.496Z","epicId":"text-ops","sliceId":"count","actor":"test-writer","event":"tests-written","payload":{"sliceId":"count","targets":["tests/count.test.ts"]}}
{"id":"rpt-code-writer-count-1779364636796-18","ts":"2026-05-21T11:57:16.796Z","epicId":"text-ops","sliceId":"count","actor":"code-writer","event":"code-written","payload":{"sliceId":"count"}}
{"id":"rpt-test-runner-count-1779364638241-19","ts":"2026-05-21T11:57:18.242Z","epicId":"text-ops","sliceId":"count","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-count-1779364663762-20","ts":"2026-05-21T11:57:43.762Z","epicId":"text-ops","sliceId":"count","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 26 tests in tests/count.test.ts pass successfully. The implementation includes: (1) a pure count() function exported from src/server/count.ts that counts whitespace-separated words and returns 0 for empty/whitespace-only input, (2) a count CLI subcommand that reads argv[2] and prints the word count to stdout with proper exit code and formatting. Test coverage includes pure function behavior (empty strings, whitespace delimiters, leading/trailing whitespace, tabs, newlines, long inputs) and CLI behavior (exit codes, stdout output, empty arguments, whitespace handling, output formatting). No failing tests."}}
{"id":"rpt-evaluator-slugify-1779364682705-21","ts":"2026-05-21T11:58:02.705Z","epicId":"text-ops","sliceId":"slugify","actor":"evaluator","event":"eval-done","payload":{"done":false,"reasoning":"The verification target tests/slugify.test.ts does not exist. The test file is required to validate the slice specification, but it is missing from the tests/ directory. The current directory only contains: cli-scaffolding.integration.test.ts, count.test.ts, help.test.ts, reverse.test.ts, and version.test.ts. No implementation of the slugify subcommand or its tests have been created."}}
{"id":"rpt-test-writer-slugify-1779364751272-22","ts":"2026-05-21T11:59:11.274Z","epicId":"text-ops","sliceId":"slugify","actor":"test-writer","event":"tests-written","payload":{"sliceId":"slugify","targets":["tests/slugify.test.ts"]}}
{"id":"rpt-code-writer-slugify-1779364804214-23","ts":"2026-05-21T12:00:04.214Z","epicId":"text-ops","sliceId":"slugify","actor":"code-writer","event":"code-written","payload":{"sliceId":"slugify"}}
{"id":"rpt-test-runner-slugify-1779364806089-24","ts":"2026-05-21T12:00:06.089Z","epicId":"text-ops","sliceId":"slugify","actor":"test-runner","event":"tests-run","payload":{"passed":true,"output":"bun test v1.3.0 (b0a6feca)\n"}}
{"id":"rpt-evaluator-slugify-1779364851082-25","ts":"2026-05-21T12:00:51.082Z","epicId":"text-ops","sliceId":"slugify","actor":"evaluator","event":"eval-done","payload":{"done":true,"reasoning":"All 46 tests in tests/slugify.test.ts pass. The implementation satisfies the slice specification:\n\n1. Pure function tests (24 tests) verify:\n - Lowercasing (all-uppercase, mixed-case)\n - Non-alphanumeric replacement with dashes (spaces, hyphens, underscores, dots, special chars)\n - Dash collapsing (consecutive dashes, mixed separators)\n - Leading/trailing dash trimming\n - Numeric preservation\n - Edge cases (empty strings, whitespace-only, special-char-only)\n\n2. Unicode diacritic tests (10 tests) verify:\n - Diacritic stripping for é, ü, ö, à, Ñ, and others\n - Combined diacritic + case handling\n\n3. CLI subcommand tests (12 tests) verify:\n - Exit code 0 on success\n - Correct stdout output (slug on single line with newline termination)\n - No stderr output on normal invocation\n - All slugify behaviors work through CLI interface\n\nImplementation in src/server/slugify.ts uses Unicode NFD normalization + combining-mark removal for diacritics, followed by the required transformations. CLI integration in src/server/cli.ts correctly handles the 'slugify' subcommand."}}
{"id":"rpt-orchestrator-reverse-1779364971688-26","ts":"2026-05-21T12:02:51.688Z","epicId":"text-ops","sliceId":"reverse","actor":"orchestrator","event":"epic-verified","payload":{"passed":true}}
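A log in this shape is easy to post-process. The sketch below counts evaluator rounds per slice and keeps the latest verdict, assuming one JSON object per line; `summarizeSlices` and the abbreviated sample records are hypothetical and only mirror the field names visible in the log.

```typescript
// Only the fields the summary needs; the real records carry more.
type ReportLine = {
  sliceId: string;
  event: string;
  payload: { done?: boolean };
};

type SliceSummary = { evaluations: number; done: boolean };

// Count eval-done reports per slice and remember the most recent verdict.
function summarizeSlices(jsonl: string): Map<string, SliceSummary> {
  const summary = new Map<string, SliceSummary>();
  for (const line of jsonl.split('\n')) {
    if (line.trim() === '') continue; // tolerate blank lines
    const report = JSON.parse(line) as ReportLine;
    if (report.event !== 'eval-done') continue;
    const prev = summary.get(report.sliceId) ?? { evaluations: 0, done: false };
    summary.set(report.sliceId, {
      evaluations: prev.evaluations + 1,
      done: report.payload.done === true, // latest verdict wins
    });
  }
  return summary;
}

// Abbreviated records in the shape of the log (reasoning text omitted).
const sample = [
  '{"sliceId":"version-flag","event":"eval-done","payload":{"done":false}}',
  '{"sliceId":"version-flag","event":"tests-run","payload":{"passed":true}}',
  '{"sliceId":"version-flag","event":"eval-done","payload":{"done":true}}',
].join('\n');

const versionFlag = summarizeSlices(sample).get('version-flag');
// versionFlag is { evaluations: 2, done: true }
```

In the run recorded here, every slice shows the same pattern: one failing evaluation, one write-tests/write-code/run-tests pass, then a passing evaluation.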