(3) #[concurrency::test] macro, stress dispatcher, thread::scope shim#1534
Closed
daniel-noland wants to merge 3 commits into
Closed
Conversation
960e461 to
0ef85f9
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
Reorganizes the concurrency crate to absorb the former dataplane-quiescent crate and to present a unified, parking_lot-shaped synchronization surface that routes to either real parking_lot/std/crossbeam-utils (default), loom, or shuttle based on build-time concurrency = "..." cfg. Lock APIs (Mutex::lock, RwLock::read/write) now return guards directly, so call sites across NAT, flow tables, flow info, stats, and flow-filter drop their .unwrap()/.expect()/if let Ok(...) boilerplate.
Changes:
- Add
concurrency::sync(parking_lot-shaped facade with poison→panic for loom/shuttle),concurrency::crossbeam(real or fallbackWaitGroup/ShardedLock/Backoff/CachePadded),concurrency::slotandconcurrency::quiescentmodules; introducebuild.rsdrivingconcurrency/dataplane_concurrency_slotcfgs and renamesilence_clippy→_silence_clippy. - Delete the standalone
quiescentcrate and re-route all consumers todataplane_concurrency::quiescent. - Sweep through call sites to drop
LockResulthandling now that the facade panics on poison; add#[cfg(not(miri))]guards around heavyprintln!/traced_testmacros in tests; tweak nextest filter and CI miri/loom matrix to packageconcurrencyinstead ofquiescent.
Reviewed changes
Copilot reviewed 45 out of 47 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| Cargo.toml, Cargo.lock | Drop quiescent member, add crossbeam/crossbeam-utils deps, enable tokio parking_lot, register concurrency package metadata. |
| .github/workflows/dev.yml | Replace quiescent-only miri runs with broader "extra important" packages; loom job builds concurrency. |
| justfile | Adjust nextest loom binary filter to use a regex. |
| concurrency/Cargo.toml, build.rs | New features (arc-swap, parking_lot, _strict_provenance, _silence_clippy); build script emits backend cfgs. |
| concurrency/src/lib.rs, macros.rs | New module layout, _silence_clippy rename, doc-inline re-exports for thread. |
| concurrency/src/sync/{mod,default_backend,test_facade}.rs | Backend-routed sync facade; loom/shuttle backend panics on poison. |
| concurrency/src/crossbeam/{mod,standard,fallback}.rs | Real crossbeam-utils on default; local fallbacks (Backoff, CachePadded, WaitGroup, ShardedLock) for loom/shuttle. |
| concurrency/src/slot/{mod,standard,fallback}.rs | Single-slot publication moved out of quiescent. |
| concurrency/src/quiescent.rs, QUIESCENT.md | QSBR primitives moved in; long doc retained (with stale crate references). |
| concurrency/tests/quiescent_*.rs | Tests renamed/retargeted to dataplane_concurrency::quiescent. |
| quiescent/* | Crate deleted. |
| nat/src/{portfw,stateful,icmp_handler,...}, flow-entry/src/flow_table/*, flow-filter/src/{lib,tests}.rs, net/src/flows/{display,flow_info}.rs, stats/src/vpc_stats.rs | Drop LockResult plumbing now that facade returns guards directly; restructure if let Ok(...) chains; switch try_read/try_write consumers to Option. |
| flow-entry/src/flow_table/table.rs | Replace try_read+Duration sleep with tokio::task::yield_now() loop; reshard/insert/lookup/iter use new return types; bolero test reshaped via cloned(). |
| routing/src/fib/test.rs, flow-entry/src/flow_table/{display,nf_lookup}.rs, nat/src/stateful/test.rs | Add #[cfg(not(miri))] around println!/traced_test; reduce miri packet counts. |
1182578 to
f7558f9
Compare
79be559 to
80ca8dd
Compare
80ca8dd to
f0de528
Compare
Comment on lines
27
to
31
| #[cfg(all(miri, any(feature = "shuttle", feature = "loom")))] | ||
| compile_error!("miri does not meaningfully support 'loom' or 'shuttle'"); | ||
|
|
||
| #[cfg(not(any(feature = "loom", feature = "shuttle")))] | ||
| pub use std::sync; | ||
|
|
||
| #[cfg(not(any(feature = "loom", feature = "shuttle")))] | ||
| pub use std::thread; | ||
|
|
||
| #[cfg(all( | ||
| feature = "loom", | ||
| not(feature = "shuttle"), | ||
| not(feature = "silence_clippy") | ||
| ))] | ||
| pub use loom::sync; | ||
|
|
||
| #[cfg(all( | ||
| feature = "loom", | ||
| not(feature = "shuttle"), | ||
| not(feature = "silence_clippy") | ||
| ))] | ||
| pub use loom::thread; | ||
|
|
||
| #[cfg(all( | ||
| feature = "shuttle", | ||
| not(feature = "loom"), | ||
| not(feature = "silence_clippy") | ||
| ))] | ||
| pub use shuttle::sync; | ||
|
|
||
| #[cfg(all( | ||
| feature = "shuttle", | ||
| not(feature = "loom"), | ||
| not(feature = "silence_clippy") | ||
| ))] | ||
| pub use shuttle::thread; | ||
|
|
||
| #[cfg(all(feature = "shuttle", feature = "loom", not(feature = "silence_clippy")))] | ||
| compile_error!("Cannot enable both 'loom' and 'shuttle' features at the same time"); |
cbd0617 to
e623329
Compare
51916f4 to
44abf93
Compare
44abf93 to
9591918
Compare
7470063 to
89896cd
Compare
3871d14 to
cdf0db2
Compare
89896cd to
586483b
Compare
3 tasks
Introduce a `parking_lot`-shaped facade over `Mutex` / `RwLock`, ship
the four backends together (std / parking_lot / shuttle / loom), and
rewire the data-path crates through it in a single atomic step.
The four pieces are bundled because a partial sweep leaves the
workspace in a state where the facade and its consumers disagree on
whether `Mutex::lock` returns `LockResult<Guard>` or a naked guard.
The shuttle/loom wrappers strip `LockResult` in exactly the same
poison-as-panic shape as the std wrapper, so splitting them re-
introduces that disagreement under the test/shuttle and test/loom CI
jobs.
## std::sync wrapper (poison-as-panic)
New `concurrency::sync` module exposing `Mutex` / `RwLock` with a
`parking_lot`-shaped naked-guard surface backed by a poison-as-panic
wrapper around `std::sync`. Workspace policy treats a crashed thread
as a crashed process: poison surfacing inside a lock acquire panics
through `concurrency::sync::poisoned()` rather than handing back a
possibly-torn `LockResult`.
* One cold `#[inline(never)] fn poisoned() -> !` keeps the hot path
branch-free; the wrapper is a single `match` per acquire.
* `RwLock::upgradable_read` is implemented as exclusive `write()`
for now -- sound but lossy. Documented at the backend module
level.
* `Arc`, `Weak`, atomics, `Condvar`, `mpsc`, `OnceLock`, etc. are
re-exported from `std::sync` unchanged.
## parking_lot as the production default
Adds a `parking_lot` Cargo feature (enabled by default) and
`concurrency/src/sync/parking_lot_backend.rs` -- a zero-cost
re-export of `parking_lot::{Mutex, RwLock}` plus the std re-exports
for everything `parking_lot` doesn't ship.
* Backend routing in `concurrency/src/sync/mod.rs` picks
`parking_lot_backend` when `feature = "parking_lot"` is on, falls
back to `std_backend` otherwise.
* `_strict_provenance` feature overrides `parking_lot` and routes
through `std_backend`: `parking_lot_core::word_lock` uses
integer-to-pointer casts that `-Zmiri-strict-provenance` rejects,
and the CI miri job (`--features=_strict_provenance`) exercises
the fallback slot, which needs `std::sync` underneath.
* Workspace `Cargo.toml` gets a `tokio` parking_lot override entry
so the `tokio` parking_lot feature is enabled workspace-wide.
## shuttle backend
`concurrency/src/sync/shuttle_backend.rs` mirrors `std_backend.rs`:
a thin poison-as-panic wrap around `shuttle::sync::{Mutex, RwLock}`
returning naked guards, plus re-exports for `Arc` / `Weak` / atomics
/ `Condvar` / `mpsc` that pass through unchanged. `OnceLock` falls
back to `std::sync` because shuttle does not ship one (it is
pure-std machinery and does not need model-checker integration).
Three feature flags share one `dep:shuttle`, arranged as a chain
(`shuttle_dfs` -> `shuttle_pct` -> `shuttle`):
* `shuttle` -- random scheduler (default first choice).
* `shuttle_pct` -- PCT, biases toward rare interleavings.
* `shuttle_dfs` -- DFS, exhaustive small-state exploration.
Same backend, scheduler chosen at runtime (by `concurrency::stress`,
added in the follow-on PR). The chain means a single `feature =
"shuttle"` cfg check is true under every variant, so the routing in
`sync/mod.rs` and the macro gates collapse to single-feature checks
(no `any(shuttle, shuttle_pct, shuttle_dfs)` chains).
## loom backend
`concurrency/src/sync/loom_backend.rs` mirrors the std and shuttle
wrappers: poison-as-panic around `loom::sync::{Mutex, RwLock}`
returning naked guards, with `LockResult` / `TryLockError` /
`OnceLock` etc. re-exported from `std::sync` because loom does not
ship them.
Loom 0.7 has no `Weak<T>` and no `Arc::downgrade`. The backend ships
a local `Arc<T>` wrapper that adds `downgrade` plus a `Weak<T>` shim
that holds a *strong* clone of the inner `loom::sync::Arc` until the
`Weak` itself drops. Documented quirks (`Weak::upgrade` always
succeeds after a `downgrade`, `Arc::weak_count` panics rather than
silently returning 0, last `Arc` drop does not free when a `Weak`
is live) are explicit at the module level -- they bound what loom
can model, and silent fallbacks would let assertions pass for the
wrong reason on every backend.
`sync/mod.rs` routes the loom feature ahead of every other backend.
## Workspace transition
Sweep the data-path crates (`flow-entry`, `flow-filter`, `nat`,
`net`, `pipeline`, `routing`, `stats`) to consume
`dataplane_concurrency::sync::{Arc, Mutex, RwLock, atomic, ...}`
instead of `std::sync::*` / `parking_lot::*` directly.
* Call sites drop the `.unwrap()` / `.expect("poisoned")` noise the
`LockResult`-shaped surface forced, since the facade returns
naked guards. Same shape as the existing `parking_lot::Mutex`
consumers.
* `concurrency_mode(std)` annotations at QSBR-touching test sites
in `nat::stateful::apalloc::test_alloc`,
`routing::fib::test`, and `flow-entry::flow_table::table::tests`
keep the existing std-only feature gating intact.
* `nat::portfw::flow_state::get_packet_port_fw_state`
simplification: the explicit `drop(guard)` dance before `debug!`
calls is removed (per the recorded author intent that it isn't
worth the code complexity); the guard now drops implicitly at
the end of scope.
Mechanical-by-design -- the sweep is mostly `use` rewrites and
`.unwrap()` removal. The semantic change is one step: every
consumer now gets the facade-routed primitives, so flipping the
loom/shuttle features at the top of the workspace is now
meaningful. The follow-on PR adds `#[concurrency::test]`, the
`stress` dispatcher, and the `thread::scope` shim that let a
single test source file run under any backend.
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
…ead::scope shim
Ship the test-time infrastructure that lets a single source file run
under every backend that the facade (in the prior PR) routes to.
Three pieces:
* `concurrency::stress(body)` -- runtime dispatcher selecting
`loom::model` / `shuttle::check_random` / `_pct` / `_dfs` /
direct call based on the active feature.
* `#[concurrency::test]` -- attribute proc-macro expanding to
`#[test]` + `stress(|| body)`. One source file, five execution
strategies, no `loom::model(|| { ... })` boilerplate at call
sites.
* `concurrency::thread::scope` -- works on every backend. Std and
shuttle re-export theirs; loom 0.7 has no `scope`, so this PR
ships a local shim built on `loom::spawn` plus an
`Arc<Mutex<Option<T>>>` keepalive that preserves the same
drop-affinity guarantee `std::thread::scope` gives.
## stress + #[concurrency::test]
`concurrency::stress(body)`:
* default -- `body()` direct, no exploration
* `loom` -- `loom::model(body)`
* `shuttle` -- `shuttle::check_random(body, 16)`
* `shuttle_pct` -- `shuttle::check_pct(body, 16, 3)`
* `shuttle_dfs` -- `shuttle::check_dfs(body, Some(16))`
The dispatch carries a defensive `_ => compile_error!` arm: if a
new model-checker feature is added to the exclusion list but no
matching dispatch arm is wired up, the build fails loudly rather
than silently skipping exploration.
`#[concurrency::test]` is a new attribute proc-macro in
`concurrency-macros`. Under model-checker backends it emits a
nested `mod <fn_name> { mod concurrency_model { #[test] fn
<backend>() { ... } } }` so test reports / JUnit show the active
backend as the leaf name (e.g.
`snapshot_observes::concurrency_model::loom`). Default backend
stays flat. `#[should_panic]` / `#[ignore]` thread through;
`async fn` and `fn(args)` are rejected at parse time.
The crate path is resolved at expansion time via
`proc-macro-crate`, so consumers depending on
`dataplane-concurrency` under any alias work without setup;
integration tests inside this crate use `extern crate
dataplane_concurrency as concurrency;` since cargo rejects
self-deps.
`stress` is re-exported at the crate root for the macro expansion
but hidden from rustdoc (`#[doc(hidden)]`) -- users land on the
macro.
## thread::scope shim
`concurrency::thread` becomes a module (was a `pub use` alias) and
re-exports the active backend's `thread::*`. Under loom it also
includes a local `thread::scope` shim built on `loom::spawn` plus
an `Arc<Mutex<Option<T>>>` keepalive pattern. A narrow
`mem::transmute` lifts the spawned closure's `'scope` to
`'static` (loom 0.7 has no `spawn_unchecked`); the keepalive
trait object keeps its honest `'scope` and `ScopeInner<'scope>`
lives in `ManuallyDrop` so dropck does not constrain `'scope`
past `scope()`'s body. Per-site SAFETY comments at both the
transmute and the `ManuallyDrop::drop`.
Drop-affinity is enforced by an explicit `take_payload` on main
during teardown, not by an Arc-count assertion. Loom's
`JoinHandle::join` synchronises only on `notify`, which can fire
before the spawned thread's wrapper has finished dropping its
captures -- so the count-based check fires spuriously in some
schedules. Manual `take` on main gives the same guarantee
unconditionally.
## quiescent.rs / slot.rs cleanup
With every backend now returning naked guards from `.lock()` (the
prior PR), the remaining `LockResult` unwrap dispatch in
`quiescent.rs` disappears; `.lock()` calls reduce to plain
naked-guard form across all backends.
## Tests
* `tests/quiescent_model.rs` replaces `tests/quiescent_loom.rs`.
Each test is now a plain function annotated with
`#[concurrency::test]`; the body uses `thread::scope` instead
of the `Box::into_raw` + `&'static` lifetime workaround the
old loom-only file needed. Single source runs under any
backend, including the four-way schedule exploration shuttle
offers.
* `tests/thread_scope.rs` -- contract tests for `thread::scope`
(single/multi-spawn join, lifetime borrows, drop affinity,
nested-scope re-entry). Pins shim regressions at the right
layer.
* `tests/scope_property.rs` -- bolero x shuttle property test
generating random spawn-count / per-spawn-ops plans; pins the
"scope conservation" invariant (counter at scope return
equals the sum of generated increments) across many shapes
and interleavings.
* `tests/stress_dispatch.rs` -- coarse dispatch-routing check:
the default backend invokes the body exactly once; the
model-check backends invoke it more than once. Plus
`#[should_panic]` / `#[ignore]` attribute pass-through.
* `tests/arc_weak.rs` -- contract tests for `Arc` / `Weak` that
pass under every backend, plus loom-gated tests pinning the
documented shim quirks. File-level
`#![cfg(not(feature = "shuttle_pct"))]` -- PCT cannot run
single-threaded protocol checks.
## Documentation
`lib.rs` gains a "Compiles under loom != exhaustively checked
under loom" section listing each documented shim limitation (Weak
strong-clone semantics, `weak_count` panic, RwLock
upgradable_read, OnceLock ordering invisibility, Mutex::new const
divergence) with the workspace consequence -- most notably calling
out NAT's `Weak::upgrade().is_none()` patterns and why NAT is not
in the loom matrix today.
`sync/mod.rs` gains a "Portability footguns the facade does not
paper over" section.
`justfile` nextest filter: empty under loom (the archived
binaries are already feature-filtered), `shuttle` substring under
shuttle (matches both `quiescent_shuttle` and the new
`concurrency_model::shuttle` leaves).
Verified with `cargo nextest run -p dataplane-concurrency` under
each of: default features, `--features shuttle`, `--features
shuttle_pct`, `--features shuttle_dfs`, and `--features loom
--release`. Also verified clippy passes under each backend and
under `--all-features` (the silence_clippy escape hatch routes
through `std` and the new thread / stress modules play nice with
that).
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
This was referenced May 15, 2026
Replace the two-entry `features` matrix (loom + shuttle, single
schedule each) with a single `concurrency` job whose four steps each
exercise a different scheduler:
* shuttle -- random
* shuttle_pct -- PCT (rare interleavings)
* shuttle_dfs -- DFS (exhaustive small-state)
* loom -- the loom backend; scoped to `concurrency` only
because workspace feature unification under
`feature = "loom"` breaks crates that consume
`Weak<T>` / `Arc::downgrade` from
`concurrency::sync` (loom 0.7 doesn't ship
`Weak`).
Named steps mean GitHub's check list shows the failing scheduler
directly (`concurrency / shuttle_pct`) instead of one collapsed
`features/release/shuttle` line per backend. One runner does the
nix-setup once and reuses the cargo cache across all four steps;
the matrix shape spun up two runners and paid that cost twice.
`needs: check` is added so model-checker time is only burned after
clippy / build is green on the same SHA.
Summary job's stale `needs.features` reference is renamed to
`needs.concurrency` (without the rename it always emitted
`::error:: Some features job(s) failed` because `needs.features`
resolved to `null`).
The orphan `&release-build` YAML anchor (its only consumer was the
deleted matrix) is dropped; the matrix entry it labelled stays.
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
cdf0db2 to
30fe0d5
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The capstone PR. Ship the test-time infrastructure that lets a single source file run under every backend the facade (in #1542) routes to.
Three pieces, plus the test suite that exercises them.
concurrency::stress+#[concurrency::test]concurrency::stress(body)dispatches toloom::model/shuttle::check_random/_pct/_dfs/ direct call based on the active feature. The dispatch has a defensive_ => compile_error!arm: if a new model-checker feature is added to the exclusion list but no matching dispatch arm is wired up, the build fails loudly rather than silently skipping exploration.#[concurrency::test]proc-macro expands to#[test]plusstress(|| body). Under model-checker backends it emits a nestedmod <fn_name> { mod concurrency_model { #[test] fn <backend>() { ... } } }so test reports / JUnit show the active backend as the leaf name (e.g.snapshot_observes::concurrency_model::loom). Default backend stays flat.#[should_panic]/#[ignore]thread through;async fnandfn(args)are rejected at parse time.proc-macro-crate, so consumers under any alias work without setup; integration tests inside this crate useextern crate dataplane_concurrency as concurrency;since cargo rejects self-deps.stressis re-exported at the crate root for the macro expansion but hidden from rustdoc (#[doc(hidden)]) -- users land on the macro.thread::scopeshimconcurrency::thread::scopeworks on every backend. Std and shuttle re-export theirs; loom 0.7 doesn't havescope, so we shipconcurrency/src/thread/loom_scope.rs: built onloom::spawnplus anArc<Mutex<Option<T>>>keepalive that preserves the same drop-affinity guaranteestd::thread::scopegives.take_payloadon main during teardown, not by an Arc-count assertion. Loom'sJoinHandle::joinsynchronises only onnotify, which can fire before the spawned thread's wrapper has finished dropping its captures -- so the count-based check fires spuriously in some schedules. Manualtakeon main gives the same guarantee unconditionally.mem::transmuteto lift the spawned closure's'scopeto'static(loom 0.7 has nospawn_unchecked); the keepalive trait object keeps its honest'scopeandScopeInner<'scope>lives inManuallyDropso dropck doesn't constrain'scopepastscope()'s body. Per-site SAFETY comments at both the transmute and theManuallyDrop::drop.Tests
tests/quiescent_model.rsreplaces the loom-onlyquiescent_loom.rs. One source, every backend, via#[concurrency::test]+thread::scope.tests/thread_scope.rs-- contract tests forthread::scope(single/multi-spawn, drop affinity, nested-scope re-entry).tests/scope_property.rs-- bolero x shuttle conservation property over generated spawn/op plans.tests/stress_dispatch.rs--stressinvokes body once on default, more than once on model-check backends; plus#[should_panic]/#[ignore]attribute pass-through.tests/arc_weak.rs--Arc/Weakcontract (passes under every backend, plus loom-only quirk tests). File-level#![cfg(not(feature = "shuttle_pct"))]-- PCT can't run single-threaded protocol checks.justfilenextest filter: empty under loom (the archived binaries are already feature-filtered),shuttlesubstring under shuttle (matches bothquiescent_shuttleand the newconcurrency_model::shuttleleaves).Documentation
lib.rsgains a "Compiles under loom != exhaustively checked under loom" section listing each documented shim limitation (Weak strong-clone semantics,weak_countpanic, RwLock upgradable_read, OnceLock ordering invisibility, Mutex::new const divergence) with the workspace consequence -- most notably calling out NAT'sWeak::upgrade().is_none()patterns and why NAT isn't in the loom matrix today.sync/mod.rsgains a "Portability footguns the facade does not paper over" section.PR 3 of 3 in the concurrency facade stack. Depends on #1542 -- sync facade + backends + workspace sweep. After this lands, the stack reaches feature parity with the original PR.
Test plan
cargo nextest run -p dataplane-concurrency(default).cargo nextest run -p dataplane-concurrency --features shuttle.cargo nextest run -p dataplane-concurrency --features shuttle_pct.cargo nextest run -p dataplane-concurrency --features shuttle_dfs.cargo nextest run -p dataplane-concurrency --features loom --release.cargo clippy -p dataplane-concurrency --all-targets --all-features(silence_clippy path).features/release/loomandfeatures/release/shuttlepass.check/miri/powerpc64(the_strict_provenancejob) still passes.