feat(seal-colony): v0 prototype — TUI + CLI + permissions + SQLite + Mode A push queue + e2e tests (SEA-700)#407

Open
mattwilkinsonn wants to merge 10 commits into main from sea-711-colony-in-daemon--poseidon
@mattwilkinsonn mattwilkinsonn commented May 30, 2026


Pull request

Summary

Related issues

Changes

Test plan

Screenshots

Notes for reviewers




How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • Merge Queue - adds this PR to the back of the merge queue
  • Merge Queue Fast Track - for urgent changes, fast-track this PR to the front of the merge queue

You must have a Graphite account in order to use the merge queue.

An organization admin has required the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

mattwilkinsonn and others added 10 commits May 30, 2026 23:35
…new seal-runtime crate (SEA-712)

Create empty seal-runtime/ crate skeleton. Move every src file
except main.rs + serve.rs and every test from seal-daemon to
seal-runtime. No import edits — this commit deliberately leaves
the build broken; the next commit fixes Cargo.toml + import
paths.

Per SEAL.md the bookmark TIP must pass CI; intermediate commits
in a long file-move can be red. Splitting the move from the
import fixes makes the diff per commit reviewable:

- This commit: the rename layer. jj detects every file move as
  a rename, so the diff renders as `R old => new`.
- Next commit: the wire-up layer. Cargo.toml split, lib.rs
  facade, internal `seal_daemon::` references rewritten to
  `seal_runtime::`, external callers updated.

Files NOT moved: src/main.rs (binary entry point) and
src/serve.rs (UDS listener + tonic Server registration). Those
stay because they're what makes `seal-daemon` a daemon — the
binary that owns the socket and registers services. Everything
load-bearing (engine, grpc handlers, session driver, llm,
sccache, sandbox, tools, ...) lives in seal-runtime now.

Co-Authored-By: seal <noreply@sealedsecurity.com>
…oml + imports (SEA-712)

Companion to the prior "commit A — move source files" commit.
After file moves, the workspace was uncompileable; this commit
restores green with the changes below:

seal-daemon shrinks to the binary entry point + UDS serve loop:

- New Cargo.toml lists only the deps main.rs + serve.rs need
  (tokio, tonic, tracing, daemonize, rustls, seal-runtime, etc.).
  ~80 lines of deps moved out.
- New src/lib.rs is one line: `pub mod serve;`. No re-export
  of seal_runtime — callers thread `seal_runtime::…` paths
  explicitly so each crate's import surface stays honest.
- serve.rs imports flip `crate::engine::…` / `crate::grpc::…` /
  `crate::time::…` / `crate::runtime_identity::…` to
  `seal_runtime::…`.

seal-runtime's public surface is widened just enough to host an
external `serve.rs`:

- `DaemonHandle`, `DaemonState`, `SessionEntry` go from
  `pub(crate)` to `pub`. Their fields stay `pub(crate)` except
  for the three serve.rs actually touches: `DaemonHandle.state`,
  `DaemonHandle.dashboard_bus`, `SessionEntry.mitm_ca`.
- `DashboardBus` struct + `publish_heartbeat` go `pub`.
- `grpc::dashboard_bus` and `grpc::state` modules go `pub`.
- `DaemonHandle::new` goes `pub`.
- The rest of the runtime stays internal — handler bodies,
  the session driver, the engine internals, llm wiring, etc.
  all use crate-internal paths and stay `pub(crate)`.

Tracing target rewrites (codesmith review item 1):

- Every `target: "seal_daemon::…"` in moved code rewritten to
  `target: "seal_runtime::…"` (18 sites across grpc/, scope/,
  sccache_supervisor/, tools/, sandbox_spawn/). Without this,
  the `RUST_LOG=seal_runtime::cancel=debug` filter in ci.yml +
  nightly.yml would match zero events and the SEA-464 cancel-
  path span capture would be silently dead.
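
To illustrate why a stale target silences the capture, here is a minimal sketch of prefix-based directive matching. It is a simplified model, not the `tracing-subscriber` implementation: `Directive`, `parse_filter`, and `level_for` are hypothetical names, and real `RUST_LOG` filters also support spans, fields, and bare level entries.

```rust
/// Simplified model of one `RUST_LOG` directive such as
/// `seal_runtime::cancel=debug`: a target prefix plus a level name.
struct Directive<'a> {
    target: &'a str,
    level: &'a str,
}

/// Parse a comma-separated filter string into directives.
/// Entries without `=` are skipped to keep the sketch short.
fn parse_filter(filter: &str) -> Vec<Directive<'_>> {
    filter
        .split(',')
        .filter_map(|part| part.trim().split_once('='))
        .map(|(target, level)| Directive { target, level })
        .collect()
}

/// Return the level of the most specific directive whose target is a
/// prefix of the event's target, or `None` when nothing matches.
fn level_for<'a>(directives: &[Directive<'a>], event_target: &str) -> Option<&'a str> {
    directives
        .iter()
        .filter(|d| event_target.starts_with(d.target))
        .max_by_key(|d| d.target.len())
        .map(|d| d.level)
}
```

Under this model, an event still emitted with `target: "seal_daemon::cancel"` never matches the `seal_runtime::cancel=debug` directive, so debug-level cancel events would be dropped even though the filter looks correct.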

CI configs (.config/nextest.toml + .github/workflows/*):

- `package(seal-daemon)` filters flip to `package(seal-runtime)`
  for the integration + sccache + sandbox_command_run BDD
  buckets.
- Test group `seal-daemon-integration` → `seal-runtime-integration`
  — the required CI check is on the Tests job, not any test
  grouping, so the rename is safe.
- `RUST_LOG=seal_daemon::cancel=debug,seal_daemon=info` becomes
  `RUST_LOG=seal_runtime::cancel=debug,seal_runtime=info,seal_daemon=info`
  in ci.yml + nightly.yml so the SEA-464 cancel-path span
  capture still fires (the cancel.rs module lives in
  seal-runtime now).
- live-tests.yml + test-bwrap-versions.yml flip
  `cargo nextest -p seal-daemon` → `cargo nextest -p seal-runtime`
  for the llm-live-test build + bwrap dispatcher test.
- `.github/path-filters/llm.yml` paths repointed:
  `seal-daemon/src/{llm,engine,grpc}/**` →
  `seal-runtime/src/{llm,engine,grpc}/**`; `serve.rs` stays in
  seal-daemon.

Justfile recipe updates:

- `test-panes`: add a `seal-runtime` pane next to seal-daemon.
- `nextest-live-llm`: `-p seal-daemon --lib` →
  `-p seal-runtime --lib`.
- `timing-budgets`: `-p seal-daemon` → `-p seal-runtime`.
- `--features seal-daemon/test-hooks` continues to work
  unchanged via the new forwarding feature definition in
  seal-daemon's Cargo.toml:
    `test-hooks = ["seal-runtime/test-hooks"]`
  So `just build-tests-bins-ci` (which uses
  `--features seal-daemon/test-hooks`) lights up the runtime
  test-hooks transparently.

seal-runtime build.rs:

- Stamp emitted by `seal_build_info::collect("seal-runtime")`
  (was `"seal-daemon"`). Header comment rewritten to talk
  about the runtime, not the daemon.

Docs sweep (codesmith review item 3):

- docs/TESTING.md: "Daemon integration" header retitled to
  "Runtime integration"; path filter description updated.
- docs/INTRODUCTION.md: "How does X work?" links repointed to
  seal-runtime/src/* (llm/, scope/, tools.rs, engine/, grpc/).

The structural follow-up codesmith flagged (item 2 — move most
of `serve_with_shutdown` into seal-runtime to break the
seal-runtime → seal-daemon dev-dep cycle) is tracked separately
and not done here; it ramifies through the downstream colony
stack via the colony-side serve.rs registration site, so
splitting it lets this PR ship while the follow-up gets a
clean diff.

Verification:

- `cargo nextest run -p seal-runtime --features test-hooks`:
  843 tests run, 843 passed, 21 skipped.
- `cargo nextest run -p seal-e2e-tests --features seal-daemon/test-hooks`:
  42 tests run, 42 passed, 5 skipped.
- `cargo fmt --all --check`: clean.
- `cargo clippy --workspace --features seal-daemon/test-hooks --tests --bins`:
  clean.
- `cargo check --workspace`: clean.

Pre-existing failures unaffected:
  - seal-sandbox::bwrap_integration (NixOS path discovery,
    SEA-704).
  - integration tests that need `--features test-hooks` for
    `FakeClock` still need it (no change in behavior; the
    workspace check without `--features` still surfaces the
    same warning as before the extraction).

Co-Authored-By: seal <noreply@sealedsecurity.com>
…Mode A push queue + e2e tests (SEA-700)

Lands the seal Colony v0 prototype as a new crate seal/crates/seal-colony/
+ a seal colony subcommand on seal-cli.

Features:
- 13-verb CLI surface (5 worker-callable, 8 supervisor-only) with
  permission gate keyed off SEAL_SESSION_ID env
- SQLite store at ~/.seal/colony/colony.db, 8 tables (agents, tasks,
  agent_runs, push_queue_items, supervisor_context, supervisor_actions,
  conflict_zones, messages) with FK enforcement + WAL journaling
- Mode A push queue end-to-end (worker ready → supervisor approve / reject
  with auto-logged audit trail)
- SQL escape hatch with DDL rejection + auto-log on success/failure +
  prefilled Linear-issue reminder for missing CLI verbs
- Worker→supervisor message passing as the reroute mechanism for
  permission-denied verbs
- Bring-your-own supervisor skill (~/.seal/colony/skills/seal-supervisor.md);
  Colony refuses to start without a configured + readable skill file
- ratatui+crossterm TUI with agent list + push queue views, j/k/Tab
  navigation, a/r approve/reject keybinds
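
As a rough illustration of the escape-hatch behavior listed above (DDL rejection plus an audit entry on every attempt), the sketch below uses only the standard library. The keyword list, the `is_ddl` and `escape_hatch` names, and the log format are assumptions made for this example; the real crate's checks and audit schema are not shown in this PR text, and actual statement execution is omitted.

```rust
/// Leading keywords treated as DDL in this sketch (an assumed list).
const DDL_KEYWORDS: [&str; 3] = ["CREATE", "ALTER", "DROP"];

/// True when the first word of the statement is a DDL keyword.
/// Note: a leading comment or CTE would slip past a check this naive.
fn is_ddl(sql: &str) -> bool {
    let first = sql
        .trim_start()
        .split(|c: char| !c.is_ascii_alphabetic())
        .next()
        .unwrap_or("");
    DDL_KEYWORDS.iter().any(|kw| first.eq_ignore_ascii_case(kw))
}

/// Gate a statement and record the outcome either way.
/// Running the SQL against the store is left out of the sketch.
fn escape_hatch(sql: &str, audit: &mut Vec<String>) -> Result<(), String> {
    let outcome = if is_ddl(sql) {
        Err("DDL is not allowed through the escape hatch".to_string())
    } else {
        Ok(())
    };
    let label = if outcome.is_ok() { "ok" } else { "rejected" };
    audit.push(format!("{}: {}", label, sql.trim()));
    outcome
}
```

The point of logging before returning is that a rejected attempt leaves the same kind of trail as a successful one.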

Test coverage: 36 unit + 14 e2e = 50 tests, all passing. e2e tests
exercise the full BDD layer (Mode A flow, deny paths with no-leak
verification, bootstrap exception, escape-hatch auto-log across
ok/err/DDL paths).

v0.2 follow-ups (SEA-701, separate bookmark): seal-daemon spawn glue,
multi-machine (remote daemon over SSH tunnel), split-pane TUI,
Colony-as-seal-daemon-module migration.

Co-Authored-By: seal <noreply@sealedsecurity.com>
Adds the gRPC schema for seal Colony's RPC surface to seal.proto.
14 unary RPCs + 1 server-streaming (WatchColony) covering the
existing CLI surface verbatim plus the live-update channel that
unblocks multi-terminal coordination + Colony-in-seal-daemon.

No service implementation yet — that lands in phase 2/3 along
with the seal-daemon wiring. Phase 3 swaps seal-cli's direct
dispatch for a gRPC client. The existing seal-colony crate stays
functional throughout the migration (still backs SEA-700 v0
dogfood).

Co-Authored-By: seal <noreply@sealedsecurity.com>
…phase 2a/5)

Pure data-returning functions in src/ops.rs for every Colony
action. CLI handlers become thin formatters over ops. Same shape
the gRPC ColonyService impl (phase 2b) will use, so the service
won't duplicate logic.

Red→Green: 19 tests in tests/ops.rs lock the contract of every
op (spawn / assign / ready / approve / reject / retire / status
/ list_agents / context_set / context_show / log_append /
log_show / message / attention / sql_escape_hatch) — written
first, watched compile-fail, then ops module landed and tests
went green.

Test surface: 36 unit + 19 ops + 14 e2e = 69 tests passing,
no behavior regression.

Co-Authored-By: seal <noreply@sealedsecurity.com>
…ng (SEA-711, phase 2b/5)

ColonyServiceImpl in seal-colony/src/service.rs implements the
14 unary RPCs from the proto schema (phase 1) on top of the ops
layer (phase 2a). seal-daemon's serve.rs registers
`ColonyServiceServer` alongside `SealDaemonServer` on the same
UDS socket; Colony pool open is best-effort so a stale colony.db
doesn't break regular sessions.

Permission gate reads the caller's session id from the
`x-seal-session-id` request metadata, classifies via
permissions::caller_role, and bails with PERMISSION_DENIED on
supervisor-only verb attempts by workers / unknown sessions.
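
A minimal sketch of that gate, with request metadata modeled as a plain `HashMap` instead of tonic's metadata type. The `Sessions` registry, the `gate` function, and the signature given to `caller_role` here are hypothetical; only `approve` and `reject` are treated as supervisor-only because those are the two the commit text names explicitly.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    Supervisor,
    Worker,
    Unknown,
}

#[derive(Debug, PartialEq)]
enum Denied {
    PermissionDenied,
}

/// Hypothetical registry of known session ids.
struct Sessions {
    supervisor: String,
    workers: HashSet<String>,
}

/// Classify the caller from an optional session id.
fn caller_role(sessions: &Sessions, session_id: Option<&str>) -> Role {
    match session_id {
        Some(id) if id == sessions.supervisor => Role::Supervisor,
        Some(id) if sessions.workers.contains(id) => Role::Worker,
        _ => Role::Unknown,
    }
}

/// Subset of supervisor-only verbs used for illustration.
fn supervisor_only(verb: &str) -> bool {
    matches!(verb, "approve" | "reject")
}

/// Read `x-seal-session-id` from the metadata and deny supervisor-only
/// verbs for anyone who is not the supervisor.
fn gate(
    sessions: &Sessions,
    metadata: &HashMap<String, String>,
    verb: &str,
) -> Result<Role, Denied> {
    let id = metadata.get("x-seal-session-id").map(String::as_str);
    let role = caller_role(sessions, id);
    if supervisor_only(verb) && role != Role::Supervisor {
        return Err(Denied::PermissionDenied);
    }
    Ok(role)
}
```

A request with no metadata at all classifies as `Role::Unknown`, so it is denied on supervisor-only verbs the same way a worker is.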

WatchColony returns an empty stream as a placeholder; the
broadcast bus + real event emission land in phase 2c.

Red→Green: 12 BDD tests in tests/service.rs exercising the full
gRPC surface through tonic's in-memory duplex channel (no real
socket). Tests written first, watched fail (no `service`
module), then impl landed and they went green on first run.

Coverage: 36 unit + 19 ops + 14 cli-e2e + 12 service = 81
tests passing, no regressions across affected crates.

Co-Authored-By: seal <noreply@sealedsecurity.com>
…ase 2c/5)

ColonyServiceHandle now owns a `tokio::sync::broadcast::Sender<ColonyEvent>`
(capacity 256). Every mutation RPC (spawn / retire / ready /
approve / reject / context_set / log_append / message / attention)
emits the appropriate ColonyEvent variant after persisting state.

WatchColony returns a server-streaming subscription backed by a
fresh broadcast receiver per client. Lagged subscribers see
`Status::data_loss` once and can then call the Status RPC to
rebuild state before resubscribing. Broadcast has no replay
buffer, so SQLite stays the store of record and events are the
fast-path delta channel.
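
The lag behavior can be modeled without tokio. The sketch below is a standard-library stand-in for a bounded broadcast channel, not the `tokio::sync::broadcast` implementation: `Bus`, `Subscriber`, and `Recv` are hypothetical names, and the event payloads are placeholder strings.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
enum ColonyEvent {
    AgentUpserted(String),
    QueueUpserted(String),
}

#[derive(Debug, PartialEq)]
enum Recv {
    Event(ColonyEvent),
    /// The subscriber fell behind; this many events were overwritten.
    Lagged(u64),
    Empty,
}

/// Bounded log of recent events; the oldest entry is dropped when full.
struct Bus {
    capacity: usize,
    buf: VecDeque<ColonyEvent>,
    next_seq: u64,
}

struct Subscriber {
    cursor: u64,
}

impl Bus {
    fn new(capacity: usize) -> Self {
        Bus { capacity, buf: VecDeque::new(), next_seq: 0 }
    }

    fn send(&mut self, ev: ColonyEvent) {
        if self.buf.len() == self.capacity {
            self.buf.pop_front();
        }
        self.buf.push_back(ev);
        self.next_seq += 1;
    }

    /// New subscribers only see events sent after they subscribe.
    fn subscribe(&self) -> Subscriber {
        Subscriber { cursor: self.next_seq }
    }

    fn oldest_seq(&self) -> u64 {
        self.next_seq - self.buf.len() as u64
    }
}

impl Subscriber {
    fn recv(&mut self, bus: &Bus) -> Recv {
        let oldest = bus.oldest_seq();
        if self.cursor < oldest {
            // Report the gap once, then resume from the oldest retained event.
            let missed = oldest - self.cursor;
            self.cursor = oldest;
            return Recv::Lagged(missed);
        }
        if self.cursor == bus.next_seq {
            return Recv::Empty;
        }
        let idx = (self.cursor - oldest) as usize;
        self.cursor += 1;
        Recv::Event(bus.buf[idx].clone())
    }
}
```

Because the overwritten events are gone for good, a subscriber that sees `Lagged` cannot reconstruct state from the stream alone, which is why a full snapshot from the store of record is needed after a gap.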

Red→Green: 3 BDD tests in tests/watch.rs (single subscriber,
two-subscriber broadcast, multi-event ordering). All failed
initially because phase 2b's WatchColony returned
`futures::stream::empty()`. After wiring the bus + emit calls,
all 3 went green.

Coverage: 36 unit + 19 ops + 14 cli-e2e + 12 service + 3 watch
= 84 tests passing.

Co-Authored-By: seal <noreply@sealedsecurity.com>
…A-711, phase 3/5)

Production `seal colony <verb>` now connects to the running
seal-daemon over its UDS socket and routes every verb through
ColonyService. Daemon owns the single SQLite pool; every CLI
invocation is a thin gRPC client. Lays the groundwork for
multi-terminal coordination + remote-daemon (SEA-701) work
without further CLI-side changes.

New surface:

- `seal_colony::client::ColonyClient` — typed wrapper around
  tonic's ColonyServiceClient. `connect_uds(path, session)` for
  production, `from_channel(channel, session)` for tests.
  Carries SEAL_SESSION_ID as `x-seal-session-id` metadata on
  every call.
- `seal_colony::cli::dispatch_via_client(client, cmd)` —
  printer layer over the client. Identical UX to the legacy
  `dispatch_with_session_and_pool` direct path.
- `seal_colony::cli::dispatch(cmd)` (production entry point):
  reads SEAL_SESSION_ID env, connects to
  `seal_utils::paths::socket_path()`, calls
  `dispatch_via_client`. Surfaces a clear error when the daemon
  isn't running.
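
For the "clear error when the daemon isn't running" case, one possible shape is sketched below using only `std::os::unix::net::UnixStream`. It is Unix-only, the `connect_daemon` name and the message wording are assumptions for this example, and the real `dispatch` goes through a tonic channel, not a raw stream.

```rust
use std::io::ErrorKind;
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Try the daemon socket and turn the common "nothing is listening"
/// failures into a message that names the likely cause.
fn connect_daemon(socket: &Path) -> Result<UnixStream, String> {
    UnixStream::connect(socket).map_err(|err| match err.kind() {
        // Missing socket file, or a stale file with no listener behind it.
        ErrorKind::NotFound | ErrorKind::ConnectionRefused => format!(
            "seal-daemon does not appear to be running (socket {}): {}",
            socket.display(),
            err
        ),
        _ => format!("failed to connect to {}: {}", socket.display(), err),
    })
}
```

Distinguishing the "not running" cases from other I/O errors keeps permission problems on the socket from being misreported as a stopped daemon.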

Legacy direct-DB path (`dispatch_with_session{_and_pool}`)
remains for the 14 cli-e2e tests + TUI's current in-process
model. The TUI's switch to the client path lands in phase 4
along with WatchColony subscription rendering.

Red→Green:

- 5 BDD tests in `tests/client.rs` covering UDS connect,
  session metadata threading, status/list_agents snapshots,
  health-check shape, missing-socket error path. Written
  first, watched fail (no `client` module), then impl landed
  + tests went green.
- 4 BDD tests in `tests/dispatch_client.rs` covering the
  printer routing for spawn / assign / permission-deny /
  context+log roundtrip via dispatch_via_client. Written first,
  failed (no `dispatch_via_client` function), then impl landed.

Coverage: 36 unit + 19 ops + 14 cli-e2e + 12 service + 3 watch
+ 5 client + 4 dispatch-via-client = 93 tests passing.

Co-Authored-By: seal <noreply@sealedsecurity.com>
…phase 4/5)

Switch the Colony TUI off the 10Hz SQLite-poll model onto the
gRPC ColonyService. The TUI now connects to seal-daemon over
the same UDS the CLI uses, hydrates its state from one Status
snapshot at startup, and streams per-mutation updates from
WatchColony.

New TuiState API (BDD/TDD tests in tests/tui_state.rs land
first, Red):

- hydrate_from_status(StatusSnapshot): replaces agent +
  push-queue lists, clamps selection indices.
- apply_event(ColonyEvent): pure per-event mutator covering
  AgentUpserted (upsert by codename), AgentRetired (remove +
  clamp), QueueUpserted (upsert by bookmark). LogAppended /
  ContextSet / MessageDelivered are silent no-ops for v0: the
  TUI has no pane for them yet, and the next pane that ships
  adds the handler.
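
The two methods above can be sketched as a pure state machine. The type and variant names follow the commit text, but the field layouts (`status`, `state`, the selection indices) and the `agent` helper are assumptions for illustration, not the crate's actual definitions.

```rust
#[derive(Debug, Clone, PartialEq)]
struct AgentSnapshot {
    codename: String,
    status: String,
}

#[derive(Debug, Clone, PartialEq)]
struct PushQueueSnapshot {
    bookmark: String,
    state: String,
}

struct StatusSnapshot {
    agents: Vec<AgentSnapshot>,
    queue: Vec<PushQueueSnapshot>,
}

enum ColonyEvent {
    AgentUpserted(AgentSnapshot),
    AgentRetired { codename: String },
    QueueUpserted(PushQueueSnapshot),
    LogAppended,
    ContextSet,
    MessageDelivered,
}

#[derive(Default)]
struct TuiState {
    agents: Vec<AgentSnapshot>,
    queue: Vec<PushQueueSnapshot>,
    selected_agent: usize,
    selected_queue: usize,
}

impl TuiState {
    /// Replace both lists from a snapshot, then clamp selections.
    fn hydrate_from_status(&mut self, snap: StatusSnapshot) {
        self.agents = snap.agents;
        self.queue = snap.queue;
        self.clamp();
    }

    /// Apply one event; kinds without a pane are ignored.
    fn apply_event(&mut self, ev: ColonyEvent) {
        match ev {
            ColonyEvent::AgentUpserted(a) => {
                match self.agents.iter().position(|x| x.codename == a.codename) {
                    Some(i) => self.agents[i] = a,
                    None => self.agents.push(a),
                }
            }
            ColonyEvent::AgentRetired { codename } => {
                self.agents.retain(|x| x.codename != codename);
            }
            ColonyEvent::QueueUpserted(q) => {
                match self.queue.iter().position(|x| x.bookmark == q.bookmark) {
                    Some(i) => self.queue[i] = q,
                    None => self.queue.push(q),
                }
            }
            ColonyEvent::LogAppended
            | ColonyEvent::ContextSet
            | ColonyEvent::MessageDelivered => {}
        }
        self.clamp();
    }

    /// Keep selection indices inside the current list bounds.
    fn clamp(&mut self) {
        self.selected_agent = self.selected_agent.min(self.agents.len().saturating_sub(1));
        self.selected_queue = self.selected_queue.min(self.queue.len().saturating_sub(1));
    }
}

/// Small constructor used by the usage example.
fn agent(codename: &str, status: &str) -> AgentSnapshot {
    AgentSnapshot { codename: codename.to_string(), status: status.to_string() }
}
```

Keeping `apply_event` free of I/O is what lets the tests drive it directly with hand-built events, without a daemon or a terminal.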

State storage shape changes from agents_db::AgentRow /
pq_db::PushQueueItem to ops::AgentSnapshot /
ops::PushQueueSnapshot — same fields, but matches the shape
the gRPC client returns, so no extra mapping layer.

tui::mod.rs event loop:

- Initial Status hydration before entering the loop.
- Non-blocking stream drain each frame (0ms timeout race on
  stream.next()). Stream EOF / error → drop, refresh from
  Status, attempt resubscribe so transient daemon restarts
  recover automatically.
- Keypress handlers route approve / reject through ColonyClient
  instead of direct pq_db calls — TUI inherits SEAL_SESSION_ID
  from the supervisor's surrounding shell so the daemon's
  permission gate sees it as the supervisor.
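
The per-frame drain has a simple synchronous analog. The sketch below swaps the async stream and 0ms timeout race for `std::sync::mpsc::Receiver::try_recv`, so it shows the control flow only; `drain_frame` and `Drain` are hypothetical names, and the refresh-and-resubscribe step is left to the caller.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

#[derive(Debug, PartialEq)]
enum Drain {
    /// Channel is still open; nothing more is pending this frame.
    Live,
    /// Sender went away; the caller should refresh and resubscribe.
    Disconnected,
}

/// Apply every event that is already queued, without blocking the frame.
fn drain_frame<T>(rx: &Receiver<T>, mut apply: impl FnMut(T)) -> Drain {
    loop {
        match rx.try_recv() {
            Ok(ev) => apply(ev),
            Err(TryRecvError::Empty) => return Drain::Live,
            Err(TryRecvError::Disconnected) => return Drain::Disconnected,
        }
    }
}
```

Returning a status instead of handling reconnection inside the drain keeps the render loop in charge of when to fetch a fresh snapshot.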

The legacy state.refresh(&pool) + tui::handle_key direct pool
write path is removed entirely. The dispatch_with_session_and_pool
CLI fallback stays for the integration-test path; phase 5
cleans that up after end-to-end dogfooding.

Tests: 13 new in tests/tui_state.rs (hydrate, apply_event for
each visible event kind, selection clamping, kindless events,
non-visible event kinds, existing behaviors post-refactor).
Workspace total: 106 passing (93 + 13).

Co-Authored-By: seal <noreply@sealedsecurity.com>
…11, phase 5/5)

Drop `dispatch_with_session` and `dispatch_with_session_and_pool`
from the CLI module — production has been on `dispatch_via_client`
since phase 3, and the TUI moved to `ColonyClient` in phase 4, so
nothing in the seal binary tree still touches the in-process pool
path. The 13 per-verb `handle_*(pool, ...)` formatters that only
existed to support that path go with them.

The remaining CLI surface in `cli/mod.rs` is now:

- `dispatch(cmd)` — sync entry that boots a runtime, connects to
  the daemon over UDS, threads SEAL_SESSION_ID into request
  metadata, dispatches via `dispatch_via_client`.
- `dispatch_via_client(client, cmd)` — async, takes a connected
  `ColonyClient`. Match on the verb, format the response, print.
- `client_handle_status` / `client_handle_agents` — the only
  two verbs with non-trivial multi-line render logic, kept as
  helpers so the main match arm stays scan-able.

File shrinks from 793 → 450 lines. The module-level doc updates
to reflect that permission gating is now server-side (in the
daemon's ColonyService), not client-side.

Test migration (cli_e2e.rs):

The 14 tests in cli_e2e.rs used to drive `dispatch_with_session_and_pool`
against an in-memory pool directly. They now drive `dispatch_via_client`
against an in-memory tonic Channel backed by a real `ColonyServiceImpl`
over a `tokio::io::duplex` pair — same harness pattern as the
service.rs + watch.rs tests. This is strictly stronger coverage:
the tests now exercise the full production path
(CLI → ColonyClient → tonic → ColonyServiceImpl → ops → SQLite)
instead of the legacy in-process pool path. All 14 still pass
unchanged; the asserts still read state directly off the shared
SQLite pool (the harness keeps the pool handle alongside the
client).

Two helpers introduced for the migration:
- `setup()`: in-memory pool + service + duplex Channel + ColonyClient.
- `dispatch_as(client, session_id, cmd)`: stamps `session_id` on
  the client and runs `dispatch_via_client`. Session ids are
  sticky on `ColonyClient` so each test step that needs a
  specific identity threads its session through this helper.

`permissions::require_supervisor` stays — `service.rs` still uses
it as the role classifier on incoming RPCs.

Workspace total: 106 tests passing (unchanged from phase 4 —
this is pure surface-area reduction with stronger underlying
coverage).

Co-Authored-By: seal <noreply@sealedsecurity.com>
@mattwilkinsonn mattwilkinsonn force-pushed the sea-711-colony-in-daemon--poseidon branch from 67bb475 to 48340da Compare May 31, 2026 03:54
@github-actions


Docs preview: https://fd89835b.seal-docs.pages.dev
