45 changes: 45 additions & 0 deletions docs/agent-iam-strategy.md
@@ -81,6 +81,51 @@ Sequence: ship working code → grow vendor adoption → THEN propose specs. Not

Single-act memory injection reads as "smart toy." Three acts read as "Agent IAM." See §4 for the revised Phase 1 demo.

### 2.7 The mobile-OS permission model is the product's spine (added 2026-05-31)

The clearest mental model for AgentKeys — for the parent buying a toy, for the vendor integrating it, for the engineer building it — is the **mobile-OS app-permission system**. Everyone already understands it: you install an app, it asks for the permissions it needs at first launch, you grant or deny each one, the OS enforces those grants at the syscall boundary (the app physically cannot reach your camera if you said no), and you can revoke any permission later in Settings.

AgentKeys is the same model, one layer up — for AI agents instead of mobile apps:

| Mobile OS (iOS / Android) | AgentKeys | Mechanism in our stack |
|---|---|---|
| Install an app | Onboard a new agent | `agentkeys agent create` → `agentkeys wire <runtime>` (§3.7) |
| First-launch permission prompt | Master's grant ceremony | K11 WebAuthn assertion on the master device (arch §10.7) |
| Permission categories (Camera, Location, Contacts, Mic) | Capability categories | Memory namespaces (§3.5) × services × bounds |
| Grant / deny per permission | Per-category grant | `AgentKeysScope.Scope` written on chain (arch §16.1) |
| "Allow once" / "While using" / precise-vs-approximate | Bounded grants | read-only memory; payment ≤ ¥50; specific IoT devices |
| OS enforces at the syscall boundary (app can't bypass) | **Hook enforces at the tool-call boundary** | PreToolUse hook → `permission.check` (§3.6, arch §22d) — the IAM *guarantee*, not the *tool* |
| Runtime re-prompt for a sensitive action | `permission.check` → `ask_parent` verdict | deterministic policy engine escalates |
| Revoke a permission in Settings | Parent revokes in the web UI | `cap.revoke` / `revoke_scope_with_webauthn` — the Act-3 demo |
| Per-app sandbox | Per-actor isolation | four-layer per-actor invariants (CLAUDE.md) |
| App Store review before distribution | Vendor onboarding / device pairing | arch §22c.4 |

**Why this is the spine, not just an analogy:**

1. **It tells the consumer what they're buying without saying "IAM."** Per §3.4's dual-narrative rule, a parent buying an AI toy doesn't want "Agent IAM." They want "the toy asks me before it can spend money or read the family calendar — the same way apps ask before using my camera." The mobile-OS model is the consumer pitch, already pre-installed in everyone's head.

2. **It explains why hooks (not just MCP tools) are non-negotiable.** §3.6's IAM-tool-vs-IAM-guarantee distinction *is* the mobile-OS distinction between (a) an app *politely calling* a "may I use the camera?" function it could choose to skip, and (b) the *OS intercepting* the camera syscall so the app cannot proceed without the grant. The `PreToolUse` hook is the syscall-interception layer. Without it, AgentKeys is a courtesy API; with it, AgentKeys is the OS.

3. **It makes the onboarding ceremony the product moment.** Mobile operating systems made the permission prompt at first launch the defining trust interaction of the smartphone era. AgentKeys' equivalent is the master being prompted, on their own device with biometric presence, to grant a freshly onboarded agent its capabilities; that is the moment the user feels in control. The grant ceremony (arch §10.7) is where Phase-1's demo "surprise" (§3.7) actually lands.
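
The difference in point 2 between a courtesy API and an enforced gate can be sketched in a few lines. This is a minimal illustration, not the shipped hook: the `Grant` shape, the `GRANTS` table, the service names, and the rule that an over-limit payment escalates to `ask_parent` (rather than being denied outright) are all assumptions made for the example. Only the `permission.check` name and the `ask_parent` verdict come from the table above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Grant:
    """Hypothetical per-service grant; field names mirror Scope loosely."""
    read_only: bool = True
    max_per_call: Optional[int] = None  # bound in yuan, None = no amount bound


# Hypothetical grants for one agent: read-only travel memory, payment <= 50.
GRANTS: Dict[str, Grant] = {
    "memory.travel": Grant(read_only=True),
    "payment": Grant(read_only=False, max_per_call=50),
}


def permission_check(service: str, operation: str, amount: int = 0) -> str:
    """Deterministic verdict: 'allow', 'deny', or 'ask_parent'."""
    grant = GRANTS.get(service)
    if grant is None:
        return "deny"  # no grant at all for this service
    if grant.read_only and operation != "read":
        return "deny"  # a read-only grant never permits other operations
    if grant.max_per_call is not None and amount > grant.max_per_call:
        return "ask_parent"  # assumption: over-bound actions escalate
    return "allow"


class Blocked(Exception):
    """Raised by the hook when the verdict is anything other than allow."""


def pre_tool_use(tool: Callable[[], str], service: str,
                 operation: str, amount: int = 0) -> str:
    """The hook wraps the tool call; the agent cannot skip the check."""
    verdict = permission_check(service, operation, amount)
    if verdict != "allow":
        raise Blocked(verdict)
    return tool()
```

The design point is that `tool()` is only reachable through `pre_tool_use`: the agent never gets to decide whether the check runs, which is what separates the guarantee from the tool.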

**Permission categories — the consumer-facing vocabulary (v0):**

| Consumer label | Maps to | Typical grant for a kids' AI toy |
|---|---|---|
| 出行记忆 Travel memory | `namespace=travel`, read | ✅ read |
| 健康记忆 Health memory | `namespace=personal` (health subset), read | ❌ none |
| 关系记忆 Relationship memory | `namespace=family`, read | ⚠️ read, parent-toggled |
| 工作记忆 Work memory | `namespace=work` | ❌ none |
| 支付能力 Payment | `service=payment`, `max_per_call`, `max_per_period` | ✅ ≤ ¥50/call, ≤ ¥200/day |
| 家居控制 IoT control | `service=iot`, device allow-list | ⚠️ specific devices only |
| 凭证访问 Credential access | `service=cred-store`, per-service | ❌ none |

Each category is a `(service, namespace, operation, bound)` tuple. The existing primitives already carry each field in some form: services in `Scope.services`, namespaces in the cap-token's `namespaces_allowed`, the operation in the global `Scope.read_only` flag, and bounds in `Scope.max_per_call` and its sibling caps. What v0 lacks, and what the mobile-OS model demands, is presenting them as **per-category toggles in one onboarding screen**, with each toggle bound to its own structured grant. That extension is recorded conservatively (additive, no v0 contract change) in arch §10.7.
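
As a rough illustration of the tuple, three rows of the table above could be written down as data. The `Category` type, the `"memory"` service name, the `"pay"` operation label, and the use of `None` for "no grant" are assumptions for this sketch; the bound field names follow the `Scope` fields listed in arch §10.7.

```python
from typing import Dict, NamedTuple, Optional


class Category(NamedTuple):
    """One consumer-facing toggle as a (service, namespace, operation, bound) tuple."""
    service: str
    namespace: Optional[str]   # only memory categories carry a namespace
    operation: str             # e.g. "read" or "pay" (labels are hypothetical)
    bound: Dict[str, int]      # empty dict = no numeric bound


# A subset of the kids'-toy column; None means the toggle is off.
KIDS_TOY: Dict[str, Optional[Category]] = {
    "Travel memory": Category("memory", "travel", "read", {}),
    "Health memory": None,
    "Payment": Category("payment", None, "pay",
                        {"max_per_call": 50, "max_per_period": 200}),
}


def enabled_toggles(preset: Dict[str, Optional[Category]]) -> list:
    """Labels the onboarding screen would show as switched on."""
    return sorted(label for label, cat in preset.items() if cat is not None)
```

Here `enabled_toggles(KIDS_TOY)` yields the payment and travel-memory labels, while health memory stays off because it has no tuple at all.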

**You confirm; you don't configure — the AI recommends the scopes.** A parent shouldn't face a blank toggle grid. At onboarding the AI proposes a *recommended* scope set — derived from the agent's role (the classifier, #207), the master's saved policy (global config, #201), and a safe default preset — and the master just reviews, edits, and approves it with one biometric tap. It's the app-manifest, inverted: instead of the agent declaring the permissions it wants, AgentKeys *infers* the manifest and asks you to confirm. The AI recommends; **only the master's K11 assertion grants** — the recommendation authorizes nothing on its own.

**The recommendation sharpens with use; it never loosens on its own.** An optional 2–3 question setup ("who is this for?") and, over time, the master's own grant/deny history make each next agent's recommendation smarter — held in the master's audited config, advisory-only. It ratchets toward caution: sensitive categories (health, payment, credentials) keep asking every time even after past grants, and high-impact learned defaults are periodically re-confirmed. No learned preference ever widens a live scope without a fresh K11 grant. (Mechanism in arch §10.7.)
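
The two rules in this paragraph can be sketched as two small functions. Everything here is illustrative: the category strings, the `history` mapping, and both function names are hypothetical, and the real mechanism is the one described in arch §10.7.

```python
from typing import Dict, FrozenSet

# Categories the text says keep asking every time, regardless of history.
SENSITIVE: FrozenSet[str] = frozenset({"health", "payment", "credentials"})


def prefilled(category: str, history: Dict[str, bool]) -> bool:
    """Whether a toggle is pre-filled 'on' in the next recommendation.

    Sensitive categories are never pre-filled from past grants; the rest
    follow the master's recorded grant/deny history (default: off).
    """
    if category in SENSITIVE:
        return False
    return history.get(category, False)


def effective_scope(live: FrozenSet[str], recommended: FrozenSet[str],
                    fresh_k11: bool) -> FrozenSet[str]:
    """A recommendation changes the live scope only with a fresh K11 grant."""
    return recommended if fresh_k11 else live
```

With a history of `{"travel": True, "payment": True}`, travel is pre-filled but payment is not, and a wider recommendation leaves the live scope untouched until the master approves it.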

---

## 3. Four corrections that reshape architecture commitments
45 changes: 45 additions & 0 deletions docs/arch.md
@@ -582,6 +582,51 @@ OS keychain when available (Linux GNOME Keyring, Windows Credential Locker). Whe
| **Master K10 + biometric presence** | Above plus: mutate scope, bind new master device, rotate K10, mint new agent omnis. Bounded to this human's actor tree. Visible on chain (sovereign mode default). Recovery (§11) revokes within ~60s. |
| **Agent K10 leaked** (sandbox host root) | Cap-mint under `O_agent_A` until link-code rotation or session-JWT TTL expiry. **Per-actor binding** prevents impersonating siblings. Cannot rebind, mutate scope, or escalate to master. PrincipalTag at STS prevents cross-agent S3 access. |

### 10.7 Permission-grant ceremony — the mobile-OS analog (added 2026-05-31)

The strategy doc ([agent-iam-strategy.md §2.7](agent-iam-strategy.md)) frames the whole product as the mobile-OS app-permission model: install → first-launch permission prompt → per-permission grant → OS-enforced at the syscall boundary → revoke in Settings. This section records how that UX maps onto the **existing** v2 primitives. It introduces no new contract, key, or wire change — it is a naming + composition layer over what §10.2, §16.1, §19, and §22d already ship.

**Mapping to shipped primitives:**

| Mobile-OS step | AgentKeys mechanism | Where it already lives |
|---|---|---|
| Install app | Agent onboarding | §10.2 link-code bootstrap |
| First-launch permission prompt | Master grant ceremony — K11 WebAuthn on master, with operator-readable intent | §10.1 K11 ceremony + intent-rendering |
| Grant / deny per permission | `set_scope_with_webauthn(operator, agent, Scope, k10_sig, k11_assertion)` | §16.1 `AgentKeysScope` |
| Bounded grant (≤ ¥50, read-only) | `Scope.{read_only, max_per_call, max_per_period, max_total, payment_k11_threshold}` + cap-token `namespaces_allowed` | §16.1 + §19 cap shape |
| OS enforces at the syscall boundary | **`PreToolUse` hook → `permission.check`** — the IAM *guarantee* | §22d hooks-first |
| Runtime re-prompt | `permission.check → ask_parent` | deterministic policy engine (§22d tool layer) |
| Revoke in Settings | `revoke_scope_with_webauthn` / `cap.revoke` | §16.1 + Act-3 demo |
| Per-app sandbox | per-actor isolation | §14, §17 |

**The one-step-vs-two-step UX nuance.** Today §10.2 (agent bootstrap via link-code) and the scope grant (`set_scope_with_webauthn`) are *two* master actions. The mobile-OS model unifies them into *one* onboarding moment: when the master runs `agentkeys agent create` (or `agentkeys wire <runtime>` per strategy §3.7), the CLI presents the requested capabilities and captures a single K11 assertion authorizing both the device binding and the initial scope grant. This is a CLI/UX composition of two existing chain calls — **not** a protocol change. The two underlying transactions stay separate and independently auditable on chain.
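
A toy model of that composition, assuming an in-memory stand-in for the chain: `FakeChain`, `onboard`, and the `link_code_bootstrap` call name are hypothetical placeholders, and the `k10_sig` argument of `set_scope_with_webauthn` is omitted for brevity. The sketch only shows that one ceremony can fan out into two separately recorded transactions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FakeChain:
    """In-memory stand-in for the chain; records each call as its own tx."""
    txs: List[Tuple[str, Dict[str, object]]] = field(default_factory=list)

    def submit(self, call: str, args: Dict[str, object]) -> int:
        self.txs.append((call, args))
        return len(self.txs) - 1  # index doubles as a toy tx id


def onboard(chain: FakeChain, operator: str, agent: str,
            scope: Dict[str, object], k11_assertion: str) -> Tuple[int, int]:
    """One UX moment, two chain calls, both carrying the same K11 assertion."""
    # "link_code_bootstrap" is a placeholder name for the bootstrap call.
    bind_tx = chain.submit("link_code_bootstrap",
                           {"operator": operator, "agent": agent,
                            "k11_assertion": k11_assertion})
    scope_tx = chain.submit("set_scope_with_webauthn",
                            {"operator": operator, "agent": agent,
                             "scope": scope, "k11_assertion": k11_assertion})
    return bind_tx, scope_tx
```

Because each call lands as its own entry in `chain.txs`, an auditor can still inspect the binding and the scope grant independently even though the master tapped once.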

**Identified gap — per-category structured grants (additive, future).** The current `AgentKeysScope.Scope` (§16.1) expresses a grant as a *flat* `services[]` list, *one* global `read_only` bool covering all services, and *payment-specific* bounds (`max_per_call` etc.). The mobile-OS model wants *per-category* grants — "read-only travel memory AND no health memory AND payment ≤ ¥50 AND these two IoT devices" — which needs each category to carry its own operation + bound rather than a single global `read_only` + payment-only caps.

The conservative path is **additive**: a future `Scope.grants` field — `PermissionGrant[] { category, operation, bound }` — layered alongside the existing fields, not replacing them. Existing fields stay for back-compat (the cap verifier already ignores unknown fields gracefully by design); v0 ships the flat model; the structured model lands when the parent-control UI needs per-category toggles (the mobile-OS Settings screen). **No v0 contract change is implied by this section** — it records the direction so the eventual `Scope` extension is a known, bounded addition rather than a surprise rewrite. Tracked against the Phase-4 capability-depth work (strategy §5 Phase 4).
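
The back-compat claim can be illustrated with a reader that only knows the flat fields. This is a sketch under assumptions: the JSON encoding, the `KNOWN_V0_FIELDS` tuple, and `parse_v0_scope` are invented for the example, and the real cap verifier's wire format is not shown in this section.

```python
import json
from typing import Dict

# Flat fields a v0 reader understands (subset, for illustration only).
KNOWN_V0_FIELDS = ("services", "read_only", "max_per_call")


def parse_v0_scope(raw: str) -> Dict[str, object]:
    """A v0-style reader: keeps known fields, silently ignores the rest."""
    data = json.loads(raw)
    return {key: data[key] for key in KNOWN_V0_FIELDS if key in data}


flat = json.dumps({"services": ["payment"], "read_only": False,
                   "max_per_call": 50})

# Same scope with the future additive field layered alongside.
extended = json.dumps({
    "services": ["payment"], "read_only": False, "max_per_call": 50,
    "grants": [{"category": "payment", "operation": "pay",
                "bound": {"max_per_call": 50}}],
})
```

Parsing `flat` and `extended` with `parse_v0_scope` gives the same result, which is the property that makes the `grants` field an additive change rather than a contract break.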

**The scope recommender (advisory — composes, doesn't gate).** Onboarding pre-fills the grant from three existing master-side inputs, then asks the master to confirm:
- **`presets.rs`** (#207 default-preset bootstrap) — the deterministic starter scope.
- **`agentkeys-worker-classify`** (#207) — classifies the new agent's role → candidate capability categories.
- **master/global `config`** (#201, master-self-only + audited) — the policy ceiling (caps, sensitive-namespace denylist) plus, accrued over time, the master's grant/deny history. An optional 2–3 question profile seeds it; it is never a precondition for first use.

The recommender emits a proposed `Scope.grants` (the deferred field above) — so it is the *consumer* that motivates that extension. **Invariants:** it runs daemon-side in the master's trust domain (never broker-decides-for-you); it is LLM/heuristic-assisted and so lives on the *advisory* side of §3.6 — the deterministic cap-mint + `PreToolUse` gate is untouched; and a recommendation grants nothing, **only the master's K11 assertion does**. The learning loop is **asymmetric** — sensitive categories (health, payment, credentials) re-prompt every time regardless of history, and high-impact learned defaults resurface for periodic re-confirmation — so accrued preferences can never silently widen a live scope.
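
One plausible way the three inputs could compose is sketched below. The composition rule (start from the preset, keep what the classifier considers relevant, drop denylisted categories, clamp bounds to the configured ceiling) is an assumption, as are the function name, argument shapes, and the `authorized` flag; the section above fixes only the inputs and the invariant that the output is advisory.

```python
from typing import Dict, Optional, Set


def recommend(preset: Dict[str, Optional[int]], classified: Set[str],
              denylist: Set[str], ceiling: Dict[str, int]) -> Dict[str, object]:
    """Build an advisory proposal; it never authorizes anything itself.

    preset:     category -> default bound (None = unbounded, e.g. read-only memory)
    classified: categories the role classifier considers relevant
    denylist:   sensitive categories the master's config forbids
    ceiling:    per-category upper bounds from the master's config
    """
    grants: Dict[str, Optional[int]] = {}
    for category, bound in preset.items():
        if category not in classified or category in denylist:
            continue
        if bound is not None and category in ceiling:
            bound = min(bound, ceiling[category])  # config can only tighten
        grants[category] = bound
    # 'authorized' stays False until the master's K11 assertion is captured.
    return {"grants": grants, "authorized": False}
```

For a preset of travel memory, health memory, and a ¥100 payment bound, with health on the denylist and a ¥50 payment ceiling, the proposal contains travel memory and payment at ¥50, and is still unauthorized.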

**arch.md compatibility check (no contradictions, verified 2026-05-31):**
- ✅ §10.2 link-code bootstrap unchanged — the ceremony composes the existing two chain calls; it alters neither.
- ✅ §16.1 `AgentKeysScope` contract unchanged — the per-category `grants` field is explicitly deferred + additive; the shipped struct is untouched.
- ✅ §22d hooks-first enforcement unchanged — this section names the `PreToolUse` hook as the syscall-analog enforcement point; it introduces no competing enforcement path.
- ✅ §3.6 IAM-guarantee posture unchanged — the recommender is advisory (LLM/heuristic); cap-mint + the `PreToolUse` gate stay deterministic, and only the master's K11 assertion grants.
- ✅ §6.3 identity ≠ actor ≠ capability separation unchanged — grants attach to the (operator, agent) scope edge, exactly as `AgentKeysScope` already keys them.
- ✅ Canonical names (§5) — "permission-grant ceremony" and "permission category" are UX terms layered on existing canonical names (`agent_omni`, `cap-token`, `Scope`); no new identity/key spelling introduced.

**Cross-references:**
- [strategy §2.7](agent-iam-strategy.md) — consumer-facing framing + the permission-category vocabulary
- [§10.2](#102-agent-bootstrap-link-code-only--single-path) — the bootstrap the ceremony composes with the scope grant
- [§16.1](#161-contracts) — the `AgentKeysScope.Scope` struct the grant writes
- [§22d](#22d-iam-guarantee-delivery--hooks-first-proxy-fallback) — the hook enforcement that turns the grant into an IAM guarantee
- [#110](https://github.com/litentry/agentKeys/issues/110) — parent-control UI where per-category toggles surface

---

## 11. Recovery — M-of-N device quorum (no anchor wallet, no seed phrase)