bbingz · Jun 19, 2026
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -5,13 +5,13 @@
   },
   "metadata": {
     "description": "Polycli host adapters for Claude Code and related agent CLIs",
-    "version": "0.6.24"
+    "version": "0.6.25"
   },
   "plugins": [
     {
       "name": "polycli",
       "description": "Claude Code adapter for the shared polycli companion",
-      "version": "0.6.24",
+      "version": "0.6.25",
       "source": "./plugins/polycli"
     }
   ]

diff --git a/.github/plugin/marketplace.json b/.github/plugin/marketplace.json
@@ -5,13 +5,13 @@
   },
   "metadata": {
     "description": "Polycli marketplace for GitHub Copilot CLI",
-    "version": "0.6.24"
+    "version": "0.6.25"
   },
   "plugins": [
     {
       "name": "polycli-copilot",
       "description": "Run the shared polycli companion from GitHub Copilot CLI",
-      "version": "0.6.24",
+      "version": "0.6.25",
       "source": "./plugins/polycli-copilot"
     }
   ]

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,6 +6,24 @@ Separate from `docs/release.md` (release-focused) and `docs/archive/session-memo
 
 ---
 
+## 2026-06-19 — Claude — docs: cc-X domestic-model endpoint recipes (Path-B docs + reference data)
+
+- Added `docs/cc-x-endpoints.md` (human reference) + `docs/cc-x-recipes.json` (machine-readable source of truth) encoding the cc-X pattern: point the EXISTING `claude` runtime (BYOK) or `opencode` (OpenAI-compatible) at a domestic vendor's Anthropic-compatible endpoint via `ANTHROPIC_BASE_URL` + `ANTHROPIC_AUTH_TOKEN` + `ANTHROPIC_MODEL`. Covers 9 entries: 7 PRC core labs (MiniMax, Moonshot Kimi, Zhipu GLM, Alibaba Qwen, DeepSeek, ByteDance Doubao, StepFun) plus 2 marketplace endpoints (Baidu Qianfan, Tencent), with per-vendor base URL, model-id family, native-CLI grouping, context-window (`autoCompactWindow`), caching note, and a `source` URL+date per entry.
+- Encoded the operational gotchas: silent prompt-cache degradation on shim endpoints (dual cache-breakpoint; DeepSeek is the auto-prefix-caching exception), pin a known-good Claude Code version + `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1`, size `CLAUDE_CODE_AUTO_COMPACT_WINDOW` to the model's context, marketplace (Baidu/Tencent) model-identity instability, and the PRC data-sovereignty/Entity-List gate as SEPARATE from harness choice.
+- Honest-default: marketplace/resale endpoints carry `status: "marketplace-unstable"` and `autoCompactWindow: null` (no fabricated model/version pin), mirroring the gemini attempted-vs-used-model caveat in `docs/model-fallback-policy.md`. Enforced by `scripts/validate-cc-x-recipes.mjs` (a pure validator modeled on `validate-fixture-metadata.mjs`) + `scripts/tests/validate-cc-x-recipes.test.mjs` (auto-joined by the npm-test glob); added `npm run validate:cc-x-recipes` for standalone use.
+- Documented that cc-X is NOT a polycli provider/adapter/runtime — it rides the existing runtimes via standard env vars; the `claude -p` path forwards them via full `process.env` inheritance, while the tmux allowlist (`CLAUDE_TMUX_ENV_EXACT`) forwards the `ANTHROPIC_*` trio but NOT the `CLAUDE_CODE_*` knobs (documented, not fixed). Clarified that the polycli `minimax`/`mmx-cli` provider is a stateless text/media call, not the MiniMax cc-X coding path.
+- Cross-linked from `docs/provider-paths.md` (new subsection + Official-references bullet) and `docs/polycli-v1-public-surface.md` (one out-of-contract sentence). Recorded the no-adapter decision in `docs/roadmap.md` as closed Q10 + an Explicit-non-goals bullet. Zero runtime/production-path code change; `claude.js` env behavior left untouched by design. Verification: `node scripts/validate-cc-x-recipes.mjs` ok (9 entries), `node --test scripts/tests/validate-cc-x-recipes.test.mjs` 5/5, `npm test` + `npm run release:check` green. Snapshot facts are 2026-06-19; the validator guards structure + source-anchoring, not current-truth.
+
+## 2026-06-19 — Claude — adversarial re-verification of the workflow-review remediation
+
+- Independently re-verified the committed remediation sweep (d272042 + 03ae92d) with a Workflow fan-out (9 adversarial auditors -> double-refutation -> completeness critic). 18 raw findings -> 7 confirmed + 1 critic-confirmed; 11 refuted. Confirmed the prior fixes are sound and re-ran full validation (the prior round's open residual #1): `npm test` 544/544, `npm run release:check` exit 0.
+- Closed residual #3 (state-root permissions) with real-filesystem evidence: under permissive umask 000, stateRoot/stateDir/jobsDir resolve to 0700 and state.json/job-config to 0600, enforced by explicit chmod (not umask). Characterized residual #4 (orphan `<jobId>.json` result files leak after MAX_JOBS pruning) as a PRE-EXISTING latent issue — `removeJobFile` is a dead export and the old code pruned identically — so it is out of scope for this remediation and left flagged, not fixed.
+- Fixed 2 confirmed regressions introduced by the remediation: (1) the opencode host adapter threw on exit code 2, but 2 is the companion's documented soft signal (`health` with no healthy provider, `status --wait` timeout) that still emits a valid JSON envelope on stdout — extracted `isHardCompanionFailure(status)` so exit 2 returns the envelope while exit 1/4/5/crash still reject; (2) `cancelJob` ran `cleanupRuntimePaths` (which deletes a review job's live cwd via cleanupPaths) BEFORE killing the worker — reordered to kill first, then clean up, and skip the runtime-path deletion entirely when the kill fails (worker may still be alive).
+- Fixed 2 confirmed incomplete fixes: (1) Grok `SUCCESS_STOP_REASONS` omitted `MaxTokens`, so a truncated-but-visible answer was wrongly marked ok=false — added maxtokens/max_tokens/length (grok-build's real StopReason enum is {EndTurn, MaxTokens, MaxTurnRequests, Refusal, ToolUse, Cancelled}, verified against the installed binary); refusal/cancelled/tool_use/max_turn_requests stay non-success; (2) the run-ledger append path created `~/.polycli/state/<slug>` world-traversable (0o755) via the mode-less ensureParentDir on the run_started event that fires before any other state write — `appendRunLedgerEvent` now calls `ensureStateDir` first to land it 0o700.
+- Closed 4 confirmed test gaps (all mutation/RED-proven): pre-existing-0755 dir hardening test for `ensureStateDir` (state-1); state-dir-0700-after-append-only test for the run-ledger path (pwp-2, RED-proven); Grok non-success-stopReason-ALONE failure tests for both parseGrokJsonResult and runGrokPromptStreaming plus a MaxTokens-success test (test-1 + grok-1, RED-proven); sync `runProviderPrompt` explicit-model-before-default fallback test mirroring the streaming case (qwen-model-1); new `scripts/tests/opencode-host.test.mjs` pinning the exit-2 soft-signal contract (oc-status-1).
+- All changes respect the Path B architecture boundary: no shared runtime base class, no provider parser promotion into polycli-utils, timing four-state untouched, cleanupPaths still sourced only from internal review temp dirs.
+- Verification: focused RED/GREEN proofs for grok-1 and pwp-1 (reverting each fix turns its new test red); focused suite 66/66; `npm test` 544/544 (535 + 9 new tests); `npm run release:check` exit 0 (plugin bundles 5, fixture metadata 17, codex adapter 5; one tmux.jsonl ENOENT flake on the first run was the known full-suite-parallel-load flake — claude.test.js passes 28/28 in isolation, and the re-run was clean). Not published; current unreleased workspace work after v0.6.24.
+
 ## 2026-06-16 — Codex — Grok fixture residual cleanup
 
 - Closed the remaining workflow-review residual risk by capturing a real Grok streaming fixture with `grok 0.2.51 (f4f85a6492e) [stable]`: `grok -p 'Reply with exactly HELLO_GROK_FIXTURE and nothing else.' --output-format streaming-json -m grok-build --permission-mode plan --disable-web-search --max-turns 1`.
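
The exit-2 soft-signal fix described in the 2026-06-19 re-verification entry can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the changelog names only `isHardCompanionFailure(status)`, so the `ENVELOPE_EXIT_CODES` set, the `readCompanionEnvelope` helper, and the choice to treat every status other than 0 or 2 as hard are assumptions.

```javascript
// Assumption: `status` is the child's exit code as reported by Node's
// child_process APIs (a number, or null when the process died on a signal).
// Exit 0 is success; exit 2 is the documented soft signal that still emits
// a valid JSON envelope on stdout. Everything else is treated as hard here.
const ENVELOPE_EXIT_CODES = new Set([0, 2]);

function isHardCompanionFailure(status) {
  return !ENVELOPE_EXIT_CODES.has(status);
}

// Hypothetical caller: reject on hard failures, otherwise parse the envelope.
function readCompanionEnvelope({ status, stdout }) {
  if (isHardCompanionFailure(status)) {
    throw new Error(`companion failed with exit status ${status}`);
  }
  return JSON.parse(stdout);
}
```

With this shape, a `health` call that finds no healthy provider (exit 2) still hands its envelope back to the host, while exit 1/4/5 or a crash surfaces as a thrown error.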

diff --git a/docs/cc-x-endpoints.md b/docs/cc-x-endpoints.md
@@ -0,0 +1,54 @@
+# cc-X endpoint recipes (no native CLI cluster)
+
+Snapshot: 2026-06-19. Reference only — not a routing oracle, and **not a polycli runtime**. Review monthly, before release, and whenever a vendor endpoint changes. The machine-readable source of truth is [`cc-x-recipes.json`](./cc-x-recipes.json); this page is its human narration. Re-verify any row against its `source` URL before relying on it.
+
+## What cc-X is
+
+"cc-X" is the pattern of pointing a top-tier agentic-coding harness at a domestic LLM vendor's **Anthropic-compatible** endpoint with three standard environment variables:
+
+```bash
+export ANTHROPIC_BASE_URL="https://api.<vendor>/anthropic"
+export ANTHROPIC_AUTH_TOKEN="<your vendor key>"   # BYOK
+export ANTHROPIC_MODEL="<vendor model id>"
+```
+
+The harness is **Claude Code** for vendors with no competitive native coding CLI, or **opencode** when the target is an OpenAI-compatible model. cc-X wins for the no-native-CLI cluster because it is the best-AVAILABLE, co-designed, and 5-18x-cheaper scaffold, **not** because Claude Code is the highest-scoring harness. Controlled ablations show other open models scoring higher under other harnesses; that nuance lives in the vendor system cards, not here, because the `docs/roadmap.md` Q7 source discipline forbids citing un-sourced benchmark scores.
+
+Provider grouping:
+
+- **No competitive native coding CLI → cc-X is the path:** MiniMax, DeepSeek, Zhipu/GLM, StepFun.
+- **Has a native CLI → cc-X is a choice, not a default:** Moonshot (Kimi Code), Alibaba (Qwen Code), ByteDance (Trae / trae-agent), Baidu (Comate Zulu-CLI), Tencent (CodeBuddy Code), Xiaomi (MiMo Code).
+
+## How this rides existing polycli runtimes
+
+cc-X is **not** a polycli provider, adapter, or runtime, and this PR adds none. The recipe runs through the EXISTING `claude` runtime (BYOK env, no vendor CLI) or `opencode` (OpenAI-compatible models). polycli already forwards `ANTHROPIC_BASE_URL` / `ANTHROPIC_AUTH_TOKEN` / `ANTHROPIC_MODEL`:
+
+- On the default headless `claude -p` path, the runtime inherits the full `process.env`, so all three (and the `CLAUDE_CODE_*` knobs below) pass through unchanged.
+- On the explicit/internal tmux TUI path, the runtime forwards only an `ANTHROPIC_*` allowlist (`CLAUDE_TMUX_ENV_EXACT` in `packages/polycli-runtime/src/claude.js`). The three `ANTHROPIC_*` vars pass through there too, **but `CLAUDE_CODE_AUTO_COMPACT_WINDOW` / `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS` are NOT in that allowlist and will not reach a tmux session.** Set those two knobs on the default `claude -p` path, or export them inside the tmux session itself.
+
+There is no code to add: set the env vars and run `claude` (or polycli's `claude` provider) normally.
+
+## Operational gotchas (durable)
+
+These are the hard-won knobs the recipes encode. Per-entry specifics (base URLs, model-id families, per-vendor context window) live in `cc-x-recipes.json`.
+
+1. **Prompt caching is silently degraded on shim endpoints.** Claude Code's single cache-breakpoint produces a near-zero hit rate against MiniMax / Kimi shims, so the system prompt + tool schemas get re-billed every turn. Mitigation: use a dual cache-breakpoint and verify the gateway does not gate caching on whether the model is literally named `claude`. **DeepSeek is the exception** — it does automatic server-side prefix caching, so no client mitigation is needed.
+2. **Pin a known-good Claude Code version and set `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1`.** Claude Code auto-attaches experimental `anthropic-beta` headers that periodically 400 third-party endpoints on upgrade.
+3. **Set `CLAUDE_CODE_AUTO_COMPACT_WINDOW` to the model's real context** or Claude Code compacts prematurely. Per-model values are in `cc-x-recipes.json` (`autoCompactWindow`): e.g. DeepSeek 128000, Kimi 262144, MiniMax-M3 512000. `null` means we deliberately did not pin one.
+4. **Marketplace endpoints have no stable model identity.** See the next section.
+
+## Marketplace endpoints: honest-default refusal to pin
+
+Baidu Qianfan and Tencent's coding gateway are **resale/marketplace** endpoints (`marketplace: true`, `status: "marketplace-unstable"`). One `ANTHROPIC_MODEL` string can silently resolve to a different vendor or version, there is no client-side version pinning, and 2026 price hikes mean model identity is not stable over time. The recipe file deliberately leaves `autoCompactWindow: null` and ships no pinned model id for these entries — fabricating a stable pin would repeat exactly the "attempted vs used model" dishonesty already documented for gemini in [`docs/model-fallback-policy.md`](./model-fallback-policy.md). Treat the model string you send as a *request*, not a guarantee.
+
+## Data sovereignty is a separate gate
+
+PRC data-residency and Entity-List exposure are a **separate** decision from harness choice. The levers are intl endpoints, zero-retention terms, or self-hosted open weights (GLM-5.x MIT, Kimi mod-MIT, Qwen Apache-2.0) — not anything polycli does. China ToS does **not** make cc-X fragile: BYOK + a non-Anthropic base URL is documented and supported by Anthropic. The residual risk is indirect (export-screening could kill the native-Claude fallback; Anthropic could later gate the client), not a ToS trap.
+
+## Not the same as the polycli `minimax` provider
+
+polycli already has a `minimax` provider that calls the official `mmx-cli` (`mmx text chat --output json --non-interactive`). That is a **stateless text/media call**, not the MiniMax cc-X coding path. If you want MiniMax-M2/M3 as a coding agent, use the cc-X recipe above (Claude Code against `ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic`), not the `minimax` provider. The `MiniMax text / multimodal` row in [`provider-paths.md`](./provider-paths.md) is the stateless-call path; this page is the coding-agent path.
+
+## Official references checked
+
+Each recipe entry in `cc-x-recipes.json` carries its own `source` URL + date. Re-verify there before relying on a base URL or model id.
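
The honest-default and source-anchoring rules described above could be checked per entry along these lines. This is a hedged sketch, not the contents of `scripts/validate-cc-x-recipes.mjs`: only `marketplace`, `status`, `autoCompactWindow`, and `source` are named in the doc, so the `id`, `source.url`, and `source.date` fields and the `validateRecipeEntry` function are hypothetical.

```javascript
// Hypothetical entry shape (an assumption for illustration):
//   { id, marketplace, status, autoCompactWindow, source: { url, date } }
function validateRecipeEntry(entry) {
  const errors = [];
  const label = entry.id ?? "<unnamed>";

  // Source anchoring: every entry must point at a URL and carry a date.
  const url = entry.source?.url;
  const date = entry.source?.date;
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    errors.push(`${label}: source.url must be an http(s) URL`);
  }
  if (typeof date !== "string" || !/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    errors.push(`${label}: source.date must be YYYY-MM-DD`);
  }

  if (entry.marketplace === true) {
    // Honest default: marketplace entries must not carry a fabricated pin.
    if (entry.status !== "marketplace-unstable") {
      errors.push(`${label}: marketplace entries need status "marketplace-unstable"`);
    }
    if (entry.autoCompactWindow !== null) {
      errors.push(`${label}: marketplace entries must leave autoCompactWindow null`);
    }
  } else if (entry.autoCompactWindow !== null) {
    // A pinned window, when present, must be a positive integer token count.
    if (!Number.isInteger(entry.autoCompactWindow) || entry.autoCompactWindow <= 0) {
      errors.push(`${label}: autoCompactWindow must be a positive integer or null`);
    }
  }
  return errors;
}
```

A check like this guards structure only; as the changelog notes, it cannot confirm that a base URL or model id is still current, so the `source` URL remains the thing to re-read.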