ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section by heskew · Pull Request #452 · HarperFast/harper

heskew · 2026-05-01T22:57:30Z

Summary

Fixes the immediate symptom from harper PR #450's failed review (run 25234303343): agent burned ~20 turns on `Bash(grep -rn …)` repo-wide searches, hit `--max-turns 24`, posted no review comment, log step skipped. Diff was 5 files / +83/−8 — exploration pattern, not diff size, drove the cost.

What changes

`--max-turns 24 → 48` — band-aid. Doubles worst-case budget; ends the immediate dropouts on PRs with non-trivial cross-symbol exploration.
`## Tools` section rewrite:
- Adds a turn budget hint (1–2 turns context, 1 turn diff, ~1 per file, rest for searches + posting).
- Names repo-wide search as THE biggest turn-burner with a concrete failure-mode citation ("recent run that timed out before posting anything").
- Points at `Grep` tool with `path` parameter scoping as the right tool for the job.

Open question (followup)

In the failed run, the agent's `Bash(grep -rn …)` calls all came back with `is_error: false` — they ran. But `Bash(grep …)` shouldn't match any of the existing allowlist patterns (`Bash(git diff:)`, `Bash(git log:)`, `Bash(git blame:)`, `Bash(git show:)`, `Bash(gh ...:*)`).

Either:

claude-code-action permits any `Bash(...)` once the tool is enabled at all, or
there's a matching gotcha in our patterns.

Worth digging into so the prompt instruction is backed by an actual deny at the action level. Not blocking this PR.

Test plan

Push to a PR after this lands → confirm review completes and posts a marker'd comment.
Confirm the agent's tool-call pattern: prefers `Grep` over `Bash(grep)`, doesn't do repo-wide recursive searches.
Re-trigger PR Read-only mode #450's review (or wait for next push) → confirm review now lands.

Followup

oauth mirror once this merges.

🤖 Generated with Claude Code

Diagnosis from harper PR #450 review failure (run 25234303343): agent burned ~20 turns on `Bash(grep -rn …)` searches across the whole repo before hitting `--max-turns 24`, posted no review comment, and the log step skipped (no marker'd comment to log). Diff was small (5 files, +83/−8) — exploration pattern, not diff size, was the cost driver. Two changes: 1. **--max-turns 24 → 48.** Cheap band-aid that ends the immediate dropouts on PRs with non-trivial cross-symbol exploration. Per the existing comment, this is the real cost ceiling — bumping doubles the worst-case budget and moves us out of the 'one exploratory grep too many = no review at all' regime. 2. **Tools section: explicit turn-budget hint + sharpened anti-pattern.** Names "repo-wide search" as THE biggest turn-burner, points at the `Grep` tool with `path` scoping, and cites the failure mode ("recent run that timed out before posting anything") so the agent has concrete signal that matches its observed behavior. Out of scope: investigating whether `Bash(grep …)` is supposed to be denied by the existing `--allowedTools` list (current entries scope to `Bash(git diff:*)`, `Bash(git log:*)`, `Bash(git blame:*)`, `Bash(git show:*)`, `Bash(gh ...:*)` — `Bash(grep …)` shouldn't match any pattern but ran cleanly in #450's failed run, with `is_error: false` on every grep tool call). Either claude-code- action permits any `Bash(...)` once the tool is enabled, or there's a matching gotcha. Tracked as a follow-up; this PR addresses the immediate symptom. oauth needs the same change — will mirror after this lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mirror of HarperFast/harper#452. Same fix applied to the oauth review workflow: max-turns doubled, turn-budget hint added, repo-wide search called out as the single biggest turn-burner. The triggering harper run is referenced as the concrete failure mode rather than re-running the same diagnosis on oauth (oauth hasn't seen the same failure yet, but the prompt and budget are identical, so the same exposure exists). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Symptom: every PR was failing `Auth gate invariants / validate` with `authorize job has no step setting USERS_TO_CHECK env var`, even on workflows that clearly set it. Reported on PR #452 and others. Cause: the check `USERS_TO_CHECK presence` (added in #417 review- fixup) was using a jq-flavored expression — `// empty` in particular — but yq on ubuntu-latest is `mikefarah/yq` (Go), not jq. yq's lexer rejects `empty`. The script had `2>/dev/null` on the yq invocation, so the lexer error was eaten and the variable came back as the empty string, tripping the existence check on every workflow. Fix: rewrite the expression in idiomatic yq: - `.jobs.authorize.steps[].env.USERS_TO_CHECK | select(. != null)` — emits a stream of one value per step that sets the env var, skipping steps that don't. - `head -1` collapses the stream to a single value (or empty) for the `[ -n ]` check. Verified locally with mikefarah/yq v4.53.2 against all three harper claude-*.yml workflows — all pass. Out of scope but worth following up: - The `2>/dev/null` swallowed a real yq error. Worth either removing the suppression or capturing stderr for diagnostics when the expression returns empty. - oauth's mirror (#68) carries the same bug; a copy of this fix needs to land there too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Symptom: every PR was failing `Auth gate invariants / validate` with `authorize job has no step setting USERS_TO_CHECK env var`, even on workflows that clearly set it. Reported on PR HarperFast#452 and others. Cause: the check `USERS_TO_CHECK presence` (added in HarperFast#417 review- fixup) was using a jq-flavored expression — `// empty` in particular — but yq on ubuntu-latest is `mikefarah/yq` (Go), not jq. yq's lexer rejects `empty`. The script had `2>/dev/null` on the yq invocation, so the lexer error was eaten and the variable came back as the empty string, tripping the existence check on every workflow. Fix: rewrite the expression in idiomatic yq: - `.jobs.authorize.steps[].env.USERS_TO_CHECK | select(. != null)` — emits a stream of one value per step that sets the env var, skipping steps that don't. - `head -1` collapses the stream to a single value (or empty) for the `[ -n ]` check. Verified locally with mikefarah/yq v4.53.2 against all three harper claude-*.yml workflows — all pass. Out of scope but worth following up: - The `2>/dev/null` swallowed a real yq error. Worth either removing the suppression or capturing stderr for diagnostics when the expression returns empty. - oauth's mirror (HarperFast#68) carries the same bug; a copy of this fix needs to land there too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Spike on PR HarperFast#452's failed run found that `Bash(grep -rn …)` calls all executed (`permission_denials: []`) despite not matching any listed `Bash(...:*)` pattern. Per claude-code-action's own `docs/configuration.md` (v1.0.110): "The base GitHub tools are always included. Use `--allowedTools` to add additional tools, and `--disallowedTools` to prevent specific tools from being used." `--allowedTools` is ADDITIVE, not exclusive. In CI's non-interactive mode (no `canUseTool` callback), the SDK's `default` permission mode falls through to "execute" rather than prompting. So the comment block in claude-mention.yml claiming "Tool allowlist is a security boundary — every entry is a potential RCE primitive if a prompt injection succeeds" is overstated. The allowlist doesn't gate; the actual containment is elsewhere. The "deliberately absent / tightened" entries (`Bash(npx:*)` absent, `Bash(npm install)` no-arg) are documentation of intent, not enforcement — an injected instruction that ran them would still execute. This commit replaces those comment blocks in both claude-mention.yml and claude-issue-to-pr.yml with honest accounts of where containment actually lives: 1. Token scope (repo-scoped, no cross-repo reach). 2. Branch protection on protected refs. 3. Auth gate (CODEOWNERS-driven team check) — narrow trigger surface. 4. Allowed labels list (issue-to-pr only). 5. Runner ephemerality. 6. Prompt-injection guards in the prompt itself. The notable-absences notes are kept but reframed as prompt-level discipline rather than allowlist enforcement, with explicit notes that real mitigation (e.g., `ignore-scripts=true` in `.npmrc`, deferring installs to a separate CI job) is a future concern. Comments-only. No behavior change. Paired with `HarperFast/ai-review-prompts#8` which fixes the same misleading comment in the reusable `_claude-review.yml`. claude-review.yml in this repo will be replaced by the caller pattern when HarperFast#8 + the harper migration land — fixing it here is unnecessary churn. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ew-prompts reusable Day 1 of revised Plan A — replaces harper's inline 463-line `claude-review.yml` with a ~75-line caller of the reusable workflow shipped in `HarperFast/ai-review-prompts#8`. The single `uses:` ref pin (currently `0a5ccbc6...` = ai-review-prompts main 2026-05-05) controls everything that needs to move together: * The workflow logic itself (the reusable's authorize + review jobs, including the marker-based comment edit, the learnings-channel via $RUNNER_TEMP, the post-HarperFast#444/HarperFast#447/HarperFast#452 calibration shape). * The layer files referenced by `review-layers` (universal, harper/common, harper/v5). * The bash scripts the reusable invokes (compose-review-scope, find-prior-review-comment, log-review-to-ai-review-log, authorize-claude-workflow). * The fix-up commit baking in the honest-allowlist comment block (no longer claiming "Tool allowlist is a security boundary"). What lands * `.github/workflows/claude-review.yml` — replaced with the thin caller. Inputs: `review-layers` (universal + harper/common + harper/v5), `repo-specific-checks` (the Harper-core block, including the "this repo IS Harper core, so 'defer to Harper docs' applies to PLUGIN docs, not in-repo docs" caveat that was previously inline). Secrets passed through. * Removes three scripts the reusable now owns: `.github/scripts/compose-review-scope.sh`, `find-prior-review-comment.sh`, `log-review-to-ai-review-log.sh`. * Keeps three scripts still used by the inline mention / issue-to-pr workflows: `.github/scripts/authorize-claude-workflow.sh` (used by mention + issue-to-pr's authorize jobs; same script as the reusable, intentional duplication until those workflows migrate too), `parse-claude-mention.sh`, `validate-auth-gate-invariants.sh`. * Updates `validate-auth-gate-invariants.sh` to handle caller-pattern workflows: when a `claude-*.yml` has no inline authorize job, the validator now confirms the workflow invokes a `HarperFast/...` reusable via `uses:` pinned to a 40-char SHA (mutable refs would let an attacker silently repoint to a weakened version of the reusable). The reusable's structural invariants are validated separately by ai-review-prompts' own auth-gate-invariants.yml. What's NOT in this PR * `claude-mention.yml` and `claude-issue-to-pr.yml` stay inline. Reusables for those will land later as Day 2 of the revised Plan A; same caller-migration pattern. * oauth migration to caller — separate PR after this lands. Verified locally * YAML parses for both the new caller and the updated validator. * `bash .github/scripts/validate-auth-gate-invariants.sh` against all three current workflows: passes. `claude-review.yml` is recognized as caller-pattern and the SHA pin is enforced. The two inline workflows still pass the structural check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

heskew requested review from a team as code owners May 1, 2026 22:57

heskew merged commit 0118c1c into main May 1, 2026
22 of 25 checks passed

heskew deleted the workflow/bump-max-turns-and-search-discipline branch May 1, 2026 23:01

heskew mentioned this pull request May 5, 2026

ci(claude): clarify --allowedTools is convenience, not enforcement #473

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section#452

ci(claude): bump max-turns 24→48 + sharpen anti-pattern in Tools section#452
heskew merged 1 commit into
mainfrom
workflow/bump-max-turns-and-search-discipline

heskew commented May 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

heskew commented May 1, 2026

Summary

What changes

Open question (followup)

Test plan

Followup

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant