[codex] Harden Retriever harness artifact safety #2286

Draft
jioffe502 wants to merge 2 commits into NVIDIA:main from jioffe502:codex/retriever-harness-followup

@jioffe502

Collaborator

Summary

  • centralize credential redaction for structured harness JSON, JSONL, runfiles, overrides, and --json results
  • atomically replace JSON artifacts and track only files successfully produced by the current run
  • reject non-empty single-run and runset output directories so stale BEIR, query, or LanceDB artifacts cannot be reused
  • map artifact initialization and write failures to the documented exit code 30 without traceback paths
  • add focused protocol tests and document the output-directory contract
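The centralized redaction described in the first bullet could look roughly like the sketch below. This is an illustration only, not the harness code: the names `redact`, `dumps_redacted`, `SENSITIVE_FRAGMENTS`, and the placeholder string are all hypothetical, and the PR does not list which credential fields the real implementation matches.

```python
import json
from typing import Any

# Hypothetical key fragments; the actual list of credential fields is not given in this PR.
SENSITIVE_FRAGMENTS = ("api_key", "apikey", "token", "secret", "password")
REDACTED = "***REDACTED***"


def _is_sensitive(key: Any) -> bool:
    """Return True when a mapping key looks like a credential field."""
    name = str(key).lower().replace("-", "_")
    return any(fragment in name for fragment in SENSITIVE_FRAGMENTS)


def redact(value: Any) -> Any:
    """Return a deep copy of value with credential-like fields masked."""
    if isinstance(value, dict):
        return {
            key: REDACTED if _is_sensitive(key) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [redact(item) for item in value]
    return value


def dumps_redacted(payload: Any) -> str:
    """Serialize payload to JSON only after redaction has been applied."""
    return json.dumps(redact(payload), sort_keys=True)
```

Routing every structured write (JSON, JSONL rows, overrides, and `--json` output) through one helper of this kind means a new artifact type cannot forget to redact; the input object is left unmodified so the run itself can still use the resolved secret.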

Root cause

Artifact writes bypassed ArtifactWriter in several execution paths, the result manifest inferred current artifacts by scanning known filenames, and writer initialization happened outside artifact-failure handling. As a result, resolved secrets could be serialized directly, reused directories could advertise stale files, and I/O failures could surface as exit 70 or an uncaught traceback.
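To make the manifest problem concrete, here is a minimal sketch of a writer that replaces JSON files atomically and records only what it wrote itself, rather than scanning the directory for known filenames. `TrackingWriter` and its attributes are hypothetical stand-ins; the real `ArtifactWriter` interface is not shown in this PR. The sketch assumes flat artifact names (no subdirectories).

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


class TrackingWriter:
    """Hypothetical stand-in for the harness ArtifactWriter."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.produced: list[str] = []  # names written successfully by this run only

    def write_json(self, name: str, payload: Any) -> Path:
        target = self.root / name
        # Create the temp file next to the target so os.replace stays on one filesystem.
        fd, tmp_name = tempfile.mkstemp(dir=self.root, prefix=f".{name}.", suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(payload, handle, indent=2)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp_name, target)  # readers see the old file or the new one, never a partial one
        except BaseException:
            # Clean up the temp file so a failed write leaves nothing behind.
            try:
                os.unlink(tmp_name)
            except FileNotFoundError:
                pass
            raise
        self.produced.append(name)  # recorded only after the replace succeeded
        return target
```

A manifest built from `produced` cannot advertise a file left over from an earlier run, because a stale file never passes through `write_json`.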

Impact

Harness consumers can continue treating artifacts and exit codes as the stable interface. Structured artifacts and JSON output no longer expose credential fields, manifests describe only the current run, unsafe directory reuse fails before mutation, and artifact failures consistently report exit 30.
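The fail-before-mutation and exit-code behavior described above can be sketched as follows. Only the exit code 30 comes from this PR; `ArtifactError`, `prepare_output_dir`, and `main` are hypothetical names, and the real harness entry point will differ.

```python
import sys
from pathlib import Path

EXIT_ARTIFACT_FAILURE = 30  # exit code named in this PR


class ArtifactError(Exception):
    """Hypothetical error type for artifact initialization and write failures."""


def prepare_output_dir(path: Path) -> Path:
    """Create the output directory, or reject it if it already holds anything."""
    try:
        if path.exists():
            if not path.is_dir():
                raise ArtifactError("output path is not a directory")
            if any(path.iterdir()):
                # Refuse before touching anything so existing artifacts stay unchanged.
                raise ArtifactError("output directory is not empty")
        else:
            path.mkdir(parents=True)
    except OSError as exc:
        reason = exc.strerror or type(exc).__name__
        raise ArtifactError(f"cannot initialize output directory: {reason}") from exc
    return path


def main(argv: list[str]) -> int:
    """Map artifact failures to the artifact exit code with a short message and no traceback."""
    try:
        prepare_output_dir(Path(argv[0]))
    except ArtifactError as exc:
        print(f"artifact error: {exc}", file=sys.stderr)
        return EXIT_ARTIFACT_FAILURE
    return 0
```

Keeping directory initialization inside the same `try` that handles write failures is the point of the sketch: an I/O error at setup time then follows the same path to exit 30 as an error during a write.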

Validation

  • uvx pre-commit run --files ... (all configured hooks passed)
  • uv run pytest -q tests/test_harness_artifact_protocol.py tests/test_beir_evaluation.py (34 passed)
  • ran a real jp20_beir --dry-run --json with a reranker API-key override: exit 0, complete status, secret absent from stdout and structured artifacts, no execution-only artifacts in the manifest
  • repeated the same command against the same output directory: exit 30, no traceback, existing artifacts unchanged
