ci(miri): parallelize with cargo-nextest so Miri uses the cores (not 45 min on one)#521
Open
avrabe wants to merge 5 commits into
Open
ci(miri): parallelize with cargo-nextest so Miri uses the cores (not 45 min on one)#521avrabe wants to merge 5 commits into
avrabe wants to merge 5 commits into
Conversation
…le cores Measured root cause of the 45-min Miri job + lean-mem contention: `cargo miri test` runs ONE Miri process — a single-threaded interpreter — pinning one core for 45 min while the rest of the (ample) box sits idle. The CPU/mem doesn't vanish into contention (peak 15 concurrent jobs, ~22s queues); Miri just can't use it, and it hogs the scarce lean-mem runner for 45 min so other lean-mem jobs queue behind it. Switch to `cargo miri nextest run`: each test runs in its own process, so the runner's cores are actually used and wall-time collapses toward the slowest single test. `--test-threads 4` bounds concurrent Miri processes under the 24G shadow-memory ceiling (conservative start; raise after a green run shows peak RSS). Same test selection: the libtest `--skip <m>` list is translated to nextest's `-E 'not (test(<m>) | ...)'` (verified locally: 721 run / 412 excluded, identical modules). Adds a resource-print step so the log shows the box's real nproc/free. Refs: REQ-215, #509 Trace: skip
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Rivet Criterion Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.
| Benchmark suite | Current: b4d5cd2 | Previous: e60a3a9 | Ratio |
|---|---|---|---|
traceability_matrix/1000 |
56223 ns/iter (± 2317) |
43193 ns/iter (± 499) |
1.30 |
query/10000 |
321001 ns/iter (± 1729) |
236806 ns/iter (± 4501) |
1.36 |
This comment was automatically generated by workflow using github-action-benchmark.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
…not 24 Measured the runner directly (the new resource-print step): 32 cores, 125 GiB RAM — the "lean-mem" label is a misnomer and memory is a non-constraint. At --test-threads 4 the parallel Miri job still timed out at 45 min (~678 tests × ~20s ÷ 4 ≈ 56 min); 28 of 32 cores sat idle. Raise to 24 threads: ~226 core-min ÷ 24 ≈ ~9-10 min, using ~72 GiB (well under 125). This is where the CPU/mem was "vanishing": into idle cores, because nothing was using them. Refs: REQ-215, #509 Trace: skip
Correction to the earlier "24 threads thrash" claim: measured effective concurrency is 23x — parallelism works fine. The real problem is work VOLUME: the 678-test sweep is ~930 test-minutes (mean 104s/test, worst 557s) of mostly safe business logic (regex/glob/HTML) that Miri runs ~100-1000x slower than native and that has no `unsafe` to validate. No thread count fits that in a per-PR budget. So split it (the #498 nightly pattern): - PR/push `miri` → only the real UB surface: the SyntaxKind `transmute`s in the rowan CST parsers (sexpr + yaml_cst), 41 tests, ~3 min. This is what Miri is FOR. - nightly/manual `miri-full` → the full Miri-compatible sweep, 24 threads on the 125 GiB runner, 90-min budget, off the per-PR path. Refs: REQ-215, #509 Trace: skip
This was referenced Jun 10, 2026
When I split Miri into scoped (PR) + full (nightly) I kept MIRIFLAGS on miri-full but dropped it from the scoped job, so it ran under default Stacked Borrows and surfaced rowan's thin-token SharedReadOnly zero-size-retag UB — which I mistakenly reported as a fresh finding. Research (subagent, source- grounded) confirms: rust-analyzer/rowan #210/#211/#212 are all closed UNMERGED (upstream is rewriting rowan, not patching); the maintainer holds that rowan should pass Tree Borrows and the failing Stacked Borrows rule is unlikely to survive Rust's final aliasing model. Our fork pin is sound under Tree Borrows, which the original job already used. Restore `MIRIFLAGS: -Zmiri-disable-isolation -Zmiri-tree-borrows` on the scoped job (matching miri-full and the pre-split job). Update the rowan pin comment in Cargo.toml to reflect the closed-unmerged/rewrite reality. Refs: REQ-215, #509 Trace: skip
…ri SB-sound Adopts phall1's unmerged rust-analyzer/rowan#212 into our fork as fix/miri-soundness-v3 (= v2 + `48a1b5e` GreenToken-as-DST + `9e7abd1` cursor SB fix). v2 was only sound under Tree Borrows; the residual thin-token SharedReadOnly zero-size-retag UB is now fixed, so rivet's rowan parsers are sound under BOTH Stacked Borrows and Tree Borrows. - Cargo.toml: pin branch v2 → v3; comment rewritten to the closed-unmerged / upstream-rewrite reality + the SB-soundness. - Cargo.lock: rowan git rev → 9e7abd1. - ci.yml: the scoped PR Miri job now gates under STACKED BORROWS (strictest; drop -Zmiri-tree-borrows) since v3 makes it sound. miri-full stays on Tree Borrows for the broader (incl. mutable-path) sweep. Confirmed: rivet-core builds against v3; native sexpr::/yaml_cst:: tests pass; rowan fork v3 builds + carries phall1's miri_node_cache_rehash / miri_cursor_free_sb regression tests; actionlint clean. Refs: REQ-215, #509 Trace: skip
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The measured answer to "where do CPU/mem vanish?"
From the #520 run timeline:
cargo miri testprocess. Miri is a single-threaded interpreter, so it pins one core for 45 min while the rest of the box is idle, and hogs the scarcelean-memrunner the whole time (that's the "contention" we saw — a symptom, not the cause). The 24 GB is Miri shadow memory, not a leak.So the resources don't vanish — Miri just can't use them.
Fix:
cargo miri nextest run(process-per-test)nextest runs each test in its own process → many Miri processes in parallel → the cores get used and wall-time collapses toward the slowest single test.
--test-threads 4: conservative start so concurrent Miri shadow-memory stays under 24 GB; raise once a green run shows real peak-RSS headroom.--skip <m>list is translated to nextest-E 'not (test(<m>) | ...)'. Verified locally: 721 run / 412 excluded, identical modules.nproc,free -h) so the log makes the available CPU/mem explicit.What to watch on this PR's CI
If 4 threads lands comfortably under memory, I'll raise
--test-threadsin a follow-up to push the wall-time down further (and can then tighten the timeout). Builds on REQ-215; Refs #509.🤖 Generated with Claude Code