feat: agentkeys-gate — in-path CustomLLM gate (Plan A #305 pieces 2+5) + GateTurn audit op_kind #308
Open
hanwencheng wants to merge 2 commits into
…) + GateTurn audit op_kind

New crate `agentkeys-gate`: an OpenAI-compatible POST /v1/chat/completions endpoint (the Volcano RTC CustomLLM target) that per turn does cap-check + memory-inject + audit behind a swappable EngineAdapter. This is Plan A of docs/plan/ai-device-platform.md §3 (pieces 2 + 5).

- Reuses the shipped backend verbatim (cap-mint -> per-actor STS -> memory worker -> audit) via the shared agentkeys-backend-client::BackendClient; owns NO new wire shape (#203 one-owner rule).
- The EngineAdapter is the only swappable part (A ⊂ B): an openai-compatible upstream (Volcano -> Doubao) plus a local `echo` engine that runs the gate on real hardware before Doubao creds exist.
- No silent fallback: a broken broker/worker fails the turn loud; a per-namespace 403 is a recorded "denied", a 404/empty is "empty". A 5xx upstream body is logged for the operator but never echoed to the vendor caller.

Registers GateTurn = 90 in the canonical audit op_kind table (agentkeys-core + arch.md §15.3a). The gate-turn row records the turn-level facts; the memory reads stay audited data-plane-side as MemoryGet (#229), so a pure-chat turn is still on the ledger.

Deferred (plan §8 sequencing + hardware): Tuya firmware (piece 1), parent-control management view (piece 4), setup-broker-host deploy wiring, streaming, and all spend (§4 prerequisite: single-use caps + mint-time budgeting).

Gates: workspace build + clippy -D warnings + fmt + no-env-mutation; 21 gate tests + the core audit roundtrips.
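The swappable-engine idea in this commit message can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the real `EngineAdapter` signature, the message type, and the helper names `inject_memory` and `run_turn` are not shown in this PR text, and the real gate also does cap-check and audit around this step.

```rust
/// One chat message in the OpenAI-compatible shape (role + content).
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Hypothetical engine interface; the actual trait in the crate may differ.
pub trait EngineAdapter {
    fn name(&self) -> &'static str;
    fn complete(&self, messages: &[Message]) -> Result<String, String>;
}

/// Local engine that echoes the last user message, so the gate can run
/// without any upstream credentials.
pub struct EchoEngine;

impl EngineAdapter for EchoEngine {
    fn name(&self) -> &'static str {
        "echo"
    }

    fn complete(&self, messages: &[Message]) -> Result<String, String> {
        messages
            .iter()
            .rev()
            .find(|m| m.role == "user")
            .map(|m| format!("echo: {}", m.content))
            .ok_or_else(|| "no user message in request".to_string())
    }
}

/// Prepend the selected memory lines as a single `system` message.
/// With no memory lines the request is passed through unchanged.
pub fn inject_memory(memory_lines: &[String], mut messages: Vec<Message>) -> Vec<Message> {
    if !memory_lines.is_empty() {
        messages.insert(
            0,
            Message {
                role: "system".to_string(),
                content: memory_lines.join("\n"),
            },
        );
    }
    messages
}

/// Memory-inject, then hand the turn to whichever engine is configured.
pub fn run_turn(
    engine: &dyn EngineAdapter,
    memory_lines: &[String],
    messages: Vec<Message>,
) -> Result<String, String> {
    let messages = inject_memory(memory_lines, messages);
    engine.complete(&messages)
}
```

Because `run_turn` only sees `&dyn EngineAdapter`, an upstream-backed engine could replace `EchoEngine` without touching the injection step, which is the property the commit message describes.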
- docs/operator-runbook-gate.md: 3-level test runbook (local echo smoke test → real backend memory-inject + GateTurn audit → Doubao), with verified Part-1 outputs, negative checks, a troubleshooting table, and the full config reference.
- crates/agentkeys-gate/IMPLEMENTATION_LOG.md: narrative of what was built and why each decision was made (per-turn flow, module map, the engine red line, the no-silent-fallback classification, the GateTurn op_kind rationale, the review-fix, test coverage, deferrals).
- README links both.
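As a rough illustration of the `GateTurn = 90` registration and the audit roundtrip it needs, the sketch below encodes and decodes an op_kind tag. It assumes the tag is a single byte and uses a hypothetical `OpKind` enum with only the one variant whose value this PR states; the real table in `agentkeys-core` has other entries and may use a different representation.

```rust
use std::convert::TryFrom;

/// Hypothetical op_kind enum. Only `GateTurn = 90` is taken from this PR;
/// the other canonical entries are deliberately left out.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum OpKind {
    GateTurn = 90,
}

impl TryFrom<u8> for OpKind {
    /// An unknown tag is returned as the error so the caller can log it.
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            90 => Ok(OpKind::GateTurn),
            other => Err(other),
        }
    }
}

/// Encode an op_kind as its wire tag (assumed here to be one byte).
pub fn encode(kind: OpKind) -> u8 {
    kind as u8
}
```

Decoding through `TryFrom` rather than a cast means an unregistered tag surfaces as an error instead of being mapped silently onto some variant.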
Summary
Implements Plan A pieces 2 + 5 from `docs/plan/ai-device-platform.md` §3 (the Volcano MVP, #305): a new `agentkeys-gate` crate, an OpenAI-compatible `POST /v1/chat/completions` endpoint (Volcano RTC CustomLLM mode target) that, per LLM turn, does cap-check → memory-inject → engine completion → audit, behind a swappable engine. It is the sibling of `agentkeys-mcp-server` and reuses the shipped backend (piece 3) verbatim.

What landed

- `crates/agentkeys-gate`, the in-path gate: mint a `CapMintOp::MemoryGet` cap per `service_memory(ns)`, read the memory worker, select lines with `agentkeys-memory-engine`, prepend a `system` message → swappable `EngineAdapter` → audit.
- `engine.rs`: an `openai-compatible` upstream (Volcano → Doubao) + a local `echo` engine that runs the gate on real hardware before Doubao creds exist (Plan A §3's "prove the gate in-path on real hardware").
- Reuses the shipped backend via `agentkeys-backend-client::BackendClient` (#203, "Shared broker/worker client crate: collapse the duplicated chain impls (drift fix)"; one-owner rule).
- No silent fallback: a broken broker/worker fails the turn loud; a per-namespace 403 = `denied`, a 404/empty = `empty`. A 5xx upstream body is logged for the operator but never echoed to the vendor caller.
- `GateTurn = 90` registered end-to-end (`agentkeys-core` op_kind / typed body / decode + arch.md §15.3a). Records turn-level facts; the memory reads stay audited data-plane-side as `MemoryGet` (#229, "Worker-side durable audit for memory + cred data-plane ops (store/fetch)"), so a pure-chat turn is still on the ledger.

What did NOT land (deferred, per the plan's own §8 sequencing + constraints)

- `setup-broker-host.sh` surface (new systemd unit + nginx vhost for the CustomLLM endpoint); deploy follow-up.

Review
Adversarial multi-agent review (4 reviewers × verify) surfaced 1 real finding — a 5xx broker body could be echoed to the vendor caller. Fixed (log full for operator, return safe status category only) with a regression test.
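The shape of that fix, together with the no-silent-fallback classification, might look like the sketch below. The type names, the `classify` function, and the category strings are hypothetical, and the treatment of a 200 response body as newline-separated memory lines is an assumption; only the 403 = denied, 404/empty = empty, and "5xx body stays out of the caller response" rules come from this PR.

```rust
/// Outcome of one memory read that the turn can continue with.
#[derive(Debug, PartialEq)]
pub enum MemoryRead {
    Lines(Vec<String>),
    Denied,
    Empty,
}

/// A failed turn: a safe category for the caller, full detail for the operator.
#[derive(Debug, PartialEq)]
pub struct TurnError {
    /// Safe status category; the only part returned to the vendor caller.
    pub public: &'static str,
    /// Full upstream detail; meant for the operator log only.
    pub operator_detail: String,
}

/// Classify a worker/broker response (hypothetical helper).
pub fn classify(status: u16, body: &str) -> Result<MemoryRead, TurnError> {
    match status {
        200 => {
            let lines: Vec<String> = body
                .lines()
                .filter(|l| !l.trim().is_empty())
                .map(str::to_string)
                .collect();
            if lines.is_empty() {
                Ok(MemoryRead::Empty)
            } else {
                Ok(MemoryRead::Lines(lines))
            }
        }
        // A per-namespace 403 is a recorded outcome, not a failure.
        403 => Ok(MemoryRead::Denied),
        404 => Ok(MemoryRead::Empty),
        // A broken upstream fails the turn; the body never reaches `public`.
        500..=599 => Err(TurnError {
            public: "upstream_error",
            operator_detail: format!("status {}: {}", status, body),
        }),
        _ => Err(TurnError {
            public: "unexpected_status",
            operator_detail: format!("status {}: {}", status, body),
        }),
    }
}
```

Keeping `public` as a `&'static str` makes it impossible to interpolate an upstream body into the caller-facing field by accident, which is one way to make the regression hard to reintroduce.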
Gates
Workspace build ✓ · `cargo clippy --workspace --all-targets -- -D warnings` ✓ · `cargo fmt --all -- --check` ✓ · `check-no-env-mutation-in-tests.sh` ✓ · 21 gate tests + the core audit roundtrips ✓.

To test this (deploy surfaces)
- `agentkeys-core` changed (links into the broker) → rebuilds on next deploy, no behavior change; the gate binary is not yet a deploy surface (see deferred).
- `agentkeys-core` links into the daemon → rebuild via `dev.sh` to pick up the `GateTurn` op_kind in audit decode.

Smoke-test the new crate locally with the echo engine (no broker/Doubao needed): `cargo run -p agentkeys-gate -- --allow-anonymous --engine echo`, then POST an OpenAI chat-completions request to `:8077/v1/chat/completions`.

🤖 Generated with Claude Code