Skip to content

feat: agentkeys-gate — in-path CustomLLM gate (Plan A #305 pieces 2+5) + GateTurn audit op_kind#308

Open
hanwencheng wants to merge 2 commits into
mainfrom
claude/infallible-euler-ac76fa
Open

feat: agentkeys-gate — in-path CustomLLM gate (Plan A #305 pieces 2+5) + GateTurn audit op_kind#308
hanwencheng wants to merge 2 commits into
mainfrom
claude/infallible-euler-ac76fa

Conversation

@hanwencheng

Copy link
Copy Markdown
Member

Summary

Implements Plan A pieces 2 + 5 from docs/plan/ai-device-platform.md §3 (the Volcano MVP, #305) — a new agentkeys-gate crate: an OpenAI-compatible POST /v1/chat/completions endpoint (Volcano RTC CustomLLM mode target) that, per LLM turn, does cap-check → memory-inject → engine completion → audit, behind a swappable engine. It is the sibling of agentkeys-mcp-server and reuses the shipped backend (piece 3) verbatim.

What landed

  • crates/agentkeys-gate — the in-path gate:
    • Per turn: resolve identity → mint a CapMintOp::MemoryGet cap per service_memory(ns), read the memory worker, select lines with agentkeys-memory-engine, prepend a system message → swappable EngineAdapter → audit.
    • A ⊂ B seam (engine.rs): openai-compatible upstream (Volcano → Doubao) + a local echo engine that runs the gate on real hardware before Doubao creds exist (Plan A §3's "prove the gate in-path on real hardware").
    • No re-typed wire body — routes everything through the shared agentkeys-backend-client::BackendClient (Shared broker/worker client crate — collapse the duplicated chain impls (drift fix) #203 one-owner rule).
    • No silent fallback: a broken broker/worker fails the turn loud (502); a per-namespace 403 = recorded denied, a 404/empty = empty. A 5xx upstream body is logged for the operator but never echoed to the vendor caller.
  • New audit op_kind GateTurn = 90 registered end-to-end (agentkeys-core op_kind / typed body / decode + arch.md §15.3a). Records turn-level facts; the memory reads stay audited data-plane-side as MemoryGet (Worker-side durable audit for memory + cred data-plane ops (store/fetch) #229), so a pure-chat turn is still on the ledger.
  • Docs: arch.md component inventory + crate tree, the plan-doc §3 status block, a crate README.

What did NOT land (deferred — per the plan's own §8 sequencing + constraints)

  • Piece 1 — Tuya T5 firmware → Volcano RTC (needs the hardware; firmware spike).
  • Piece 4 — parent-control management view wired to the gate's backend (follow-up).
  • Deploy wiring — the gate is not yet a setup-broker-host.sh surface (new systemd unit + nginx vhost for the CustomLLM endpoint); deploy follow-up.
  • Memory write-back, streaming, and all spend (plan §4 blocks spend on single-use caps + mint-time budgeting).

Review

Adversarial multi-agent review (4 reviewers × verify) surfaced 1 real finding — a 5xx broker body could be echoed to the vendor caller. Fixed (log full for operator, return safe status category only) with a regression test.

Gates

Workspace build ✓ · cargo clippy --workspace --all-targets -- -D warnings ✓ · cargo fmt --all -- --check ✓ · check-no-env-mutation-in-tests.sh ✓ · 21 gate tests + the core audit roundtrips ✓.

To test this (deploy surfaces)

Surface Touched? Action
Remote broker host Shared agentkeys-core changed (links into the broker) → rebuilds on next deploy, no behavior change; the gate binary is not yet a deploy surface (see deferred). none required for this PR
Local daemon + web app agentkeys-core links into the daemon → rebuild via dev.sh to pick up the GateTurn op_kind in audit decode. local rebuild only
Chain contracts No none
Cloud (AWS / DNS / IAM) No none

Smoke-test the new crate locally with the echo engine (no broker/Doubao needed): cargo run -p agentkeys-gate -- --allow-anonymous --engine echo, then POST an OpenAI chat-completions request to :8077/v1/chat/completions.

🤖 Generated with Claude Code

…) + GateTurn audit op_kind

New crate `agentkeys-gate`: an OpenAI-compatible POST /v1/chat/completions
endpoint (the Volcano RTC CustomLLM target) that per turn does cap-check +
memory-inject + audit behind a swappable EngineAdapter — Plan A of
docs/plan/ai-device-platform.md §3 (pieces 2 + 5).

- Reuses the shipped backend verbatim (cap-mint -> per-actor STS -> memory
  worker -> audit) via the shared agentkeys-backend-client::BackendClient;
  owns NO new wire shape (#203 one-owner rule).
- The EngineAdapter is the only swappable part (A ⊂ B): an openai-compatible
  upstream (Volcano -> Doubao) plus a local `echo` engine that runs the gate
  on real hardware before Doubao creds exist.
- No silent fallback: a broken broker/worker fails the turn loud; a per-
  namespace 403 is a recorded "denied", a 404/empty is "empty". A 5xx upstream
  body is logged for the operator but never echoed to the vendor caller.

Registers GateTurn = 90 in the canonical audit op_kind table (agentkeys-core +
arch.md §15.3a). The gate-turn row records the turn-level facts; the memory
reads stay audited data-plane-side as MemoryGet (#229), so a pure-chat turn is
still on the ledger.

Deferred (plan §8 sequencing + hardware): Tuya firmware (piece 1), parent-
control management view (piece 4), setup-broker-host deploy wiring, streaming,
and all spend (§4 prerequisite: single-use caps + mint-time budgeting).

Gates: workspace build + clippy -D warnings + fmt + no-env-mutation; 21 gate
tests + the core audit roundtrips.
- docs/operator-runbook-gate.md — 3-level test runbook (local echo smoke test
  → real backend memory-inject + GateTurn audit → Doubao), with verified Part-1
  outputs, negative checks, a troubleshooting table, and the full config
  reference.
- crates/agentkeys-gate/IMPLEMENTATION_LOG.md — narrative of what was built and
  why each decision was made (per-turn flow, module map, the engine red line,
  the no-silent-fallback classification, the GateTurn op_kind rationale, the
  review-fix, test coverage, deferrals).
- README links both.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant