LSM scratch work #1564
Closed
daniel-noland wants to merge 15 commits into
…rty harness
Make the cascade scaffold actually usable, plus ship a reusable
property-test harness for consumer crates' `Absorb` impls. Four
pieces land together because each builds on the previous and they
share the ordering / reclamation invariants.
# Slot-based publication
The cascade now owns three `concurrency::slot::Slot`s -- head,
sealed-vec, tail -- rather than plain fields. Readers go through
a new `Snapshot<H, S, T>` value that holds an `Arc` to each of
the three; the snapshot is taken with a single `Cascade::snapshot()`
call and pins all three layers for its lifetime. This is the
QSBR-via-Arc-refcount story we designed: as long as a snapshot
exists, the cascade cannot reclaim the layers it references.
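A minimal std-only sketch of the pinning idea follows. It assumes a `Mutex<Arc<T>>` as a stand-in for `concurrency::slot::Slot`, whose real API is not shown in this PR text; `ToySlot`, `ToyCascade`, and `ToySnapshot` are hypothetical names, and the layer payloads are plain vectors for illustration only.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `concurrency::slot::Slot`: a swappable `Arc`.
pub struct ToySlot<T>(Mutex<Arc<T>>);

impl<T> ToySlot<T> {
    pub fn new(value: T) -> Self {
        ToySlot(Mutex::new(Arc::new(value)))
    }
    /// Clone the current `Arc`; the caller now co-owns the value.
    pub fn load(&self) -> Arc<T> {
        Arc::clone(&self.0.lock().unwrap())
    }
    /// Publish a new value; the old one is freed once no reader holds it.
    pub fn store(&self, value: T) {
        *self.0.lock().unwrap() = Arc::new(value);
    }
}

/// Three slots, one per layer kind (head, sealed vec, tail).
pub struct ToyCascade {
    pub head: ToySlot<Vec<u32>>,
    pub sealed: ToySlot<Vec<Arc<Vec<u32>>>>,
    pub tail: ToySlot<Vec<u32>>,
}

/// A snapshot pins all three layers for as long as it lives.
pub struct ToySnapshot {
    pub head: Arc<Vec<u32>>,
    pub sealed: Arc<Vec<Arc<Vec<u32>>>>,
    pub tail: Arc<Vec<u32>>,
}

impl ToyCascade {
    pub fn snapshot(&self) -> ToySnapshot {
        ToySnapshot {
            head: self.head.load(),
            sealed: self.sealed.load(),
            tail: self.tail.load(),
        }
    }
}
```

In this sketch a snapshot taken before `tail.store(..)` keeps reading the old tail, and the old allocation is dropped only when that snapshot goes away.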
`MutableHead::seal` takes `&self` because the head runs behind an
`Arc` in production. Consuming-self semantics would require
`Arc::try_unwrap` dances that buy nothing; the contract on `seal`
is now "snapshot the current contents; writes arriving on this
head after seal+swap may be silently lost."
# Rotate
`Cascade::rotate(fresh_head)` seals the current head and installs
a new empty one. Store order: new sealed vec FIRST, then new
head. Between the two stores, readers can see "old head plus new
sealed vec containing the just-sealed snapshot" -- duplicate
state, not missing state. Cascade walk shadows duplicates via
head-first lookup priority so readers always observe correct
results.
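The store order can be illustrated with a toy model. This is a sketch under assumptions: layers are plain `HashMap`s, `sealed[0]` is taken to be the newest layer, and `lookup` / `rotate_steps` are hypothetical helpers, not the cascade's API. It shows that the intermediate state (old head plus new sealed vec) answers lookups the same way as the states before and after.

```rust
use std::collections::HashMap;

pub type Layer = HashMap<&'static str, u32>;

/// Walk head first, then sealed layers newest-first, then tail.
pub fn lookup(head: &Layer, sealed: &[Layer], tail: &Layer, key: &str) -> Option<u32> {
    head.get(key)
        .or_else(|| sealed.iter().find_map(|l| l.get(key)))
        .or_else(|| tail.get(key))
        .copied()
}

/// Rotate in two steps, mirroring the store order described above.
/// Returns (state after store 1, state after store 2) as (head, sealed) pairs.
pub fn rotate_steps(
    head: &Layer,
    sealed: &[Layer],
) -> ((Layer, Vec<Layer>), (Layer, Vec<Layer>)) {
    // Store 1: publish the new sealed vec with the frozen head at the front.
    let mut new_sealed = vec![head.clone()];
    new_sealed.extend_from_slice(sealed);
    // Readers may now see the old head AND its sealed copy (duplicate state).
    let mid = (head.clone(), new_sealed.clone());
    // Store 2: publish the fresh empty head.
    let done = (Layer::new(), new_sealed);
    (mid, done)
}
```

Reversing the two stores would instead open a window where the head is already empty but its contents are not yet in the sealed vec, which is the missing-state case the ordering is designed to avoid.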
# Compact via MergeInto
`Cascade::compact(keep)` folds the oldest sealed layers into the
tail via the [`MergeInto`] trait that the sealed type must
implement against the tail type. `keep` controls how many sealed
layers stay at the front; pass `keep = 1` for the
cascade-depth-of-two invariant we want on the data-plane hot path,
`keep = 0` to collapse everything.
The merge logic is encoded in `S::merge_into(&self, target: &T) -> T`
rather than passed as a closure. This makes the merge:
* **Discoverable** -- new contributors can find the canonical
merge for a given layer type via `cargo doc`.
* **Consistent** -- all `compact` call sites use the same merge
by construction; no risk of two callers passing different
closures.
* **Testable** -- a trait-bound merge can be the subject of a
property harness (planned, alongside the existing Absorb
harness).
The cascade folds oldest-first: the back of the sealed slice is
merged into the old tail first, then progressively newer layers
overlay the accumulating result. This mirrors the cascade walk's
"newer shadows older" semantic.
Store order is symmetric to rotate: install the new tail FIRST,
then truncate the sealed vec. Between the stores, sealed shadows
tail and the merged content agrees with the still-present sealed
entries (faithful-merge contract), so the duplicate state is
harmless.
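The oldest-first fold can be sketched as follows. The trait mirrors the `merge_into(&self, target: &T) -> T` shape quoted above, but everything else is an assumption for illustration: the `Sealed` and `Tail` payloads are `BTreeMap`s, and `compact` is a free function over slices, not the real `Cascade::compact`.

```rust
use std::collections::BTreeMap;

/// Illustrative mirror of the `MergeInto` shape described above.
pub trait MergeInto<T> {
    fn merge_into(&self, target: &T) -> T;
}

#[derive(Clone, Debug, PartialEq)]
pub struct Sealed(pub BTreeMap<u32, &'static str>);

#[derive(Clone, Debug, PartialEq)]
pub struct Tail(pub BTreeMap<u32, &'static str>);

impl MergeInto<Tail> for Sealed {
    fn merge_into(&self, target: &Tail) -> Tail {
        let mut out = target.0.clone();
        // Entries from the sealed layer overwrite the tail: newer shadows older.
        out.extend(self.0.iter().map(|(k, v)| (*k, *v)));
        Tail(out)
    }
}

/// Fold everything past the first `keep` layers into the tail.
/// `sealed[0]` is the newest layer; the back of the slice is the oldest.
pub fn compact(sealed: &[Sealed], tail: &Tail, keep: usize) -> (Vec<Sealed>, Tail) {
    let keep = keep.min(sealed.len());
    let new_tail = sealed[keep..]
        .iter()
        .rev() // oldest first, so newer layers overlay the accumulating result
        .fold(tail.clone(), |acc, layer| layer.merge_into(&acc));
    (sealed[..keep].to_vec(), new_tail)
}
```

With `keep = 1` the newest sealed layer survives and every older layer is folded into the returned tail; with `keep = 0` the returned sealed vec is empty.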
# Outcome rename
`Outcome::Miss` renamed to `Outcome::Continue`. The new name
directly captures the cascade-walk semantic ("keep walking, this
layer has nothing definitive") rather than the per-layer reading
("this key is absent in this layer"). Documentation now also
spells out when NOT to use `Outcome::Forbid` -- consumers whose
lookup `Input` is decoupled from rule identity (ACL classifiers
keyed by Priority but looked up by Headers) should synthesise
removal semantics in user code rather than relying on Forbid.
# Property-test harness
`cascade::property_tests` (gated behind a `bolero` feature)
exposes `check_absorb_order_independent<V>()` for consumer crates
to verify their `Absorb` impls against the cascade's algebraic
laws. Self-tested against `LastWriteWins<u32>` via 1024+
generated op-triples per run.
# Tests
14 tests pass, clippy clean under `--features bolero`. The
`snapshot_held_across_*` tests are load-bearing for the QSBR
reclamation story: they confirm a snapshot taken before a
rotate/compact continues to observe the pre-mutation composition.
Single-writer LSM-manager assumption documented on rotate and
compact; the eventual LSM-manager wrapper will own the
serialization.
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
First real-shaped consumer exercises the cascade trait surface
against a use case that is NOT exact-match-keyed. ACL rules are
identified by priority but looked up by packet header match, so
the layer's Input (Headers) and the head's storage key (Priority)
are different things. If the cascade shape holds for ACL it
almost certainly holds for anything simpler.
# What the slice covers
AclRule, Priority, Action, Match, Headers -- minimal types
AclSealed: Vec<AclRule> sorted by priority -- Layer impl
AclHead: Mutex<BTreeMap<Priority, AclRule>> -- MutableHead impl
AclOp::Install(rule) -- only Install for now
merge_acl -- compactor closure
# What it surfaces
The slice is annotated with three open design questions at the
bottom of the file:
1. **Tombstones for rule removal.** `Outcome::Forbid` is keyed by
the layer's `Input`. For exact-match maps a tombstone on key
K means "K is absent"; the cascade walk consults the same K
in lower layers and the tombstone short-circuits. For ACL the
Input is packet Headers and rules are identified by Priority,
not by Input. Tombstoning a rule means "this RULE is gone,"
not "these PACKETS are forbidden," so the existing tombstone
mechanism does not apply directly.
acl-stack's `update.rs` solution: when removing a rule, insert
a SHADOW rule in the delta layer with the same match expression
and the table's default action, at a precedence that shadows
the original. The cascade walk hits the shadow first and
short-circuits.
This works but means `Op::Remove` must carry enough information
to synthesise the shadow -- which means the head needs the
original rule's match expression, which it does not have
(the rule lives in a lower layer). Open for design discussion.
2. **Head readability under high write rate.** This slice's head
always returns `Outcome::Continue` (formerly `Miss`) from `Layer::lookup`; writes become visible
only after the next rotation. Fine for low-rate ACL updates;
NOT fine for conntrack-shaped consumers where new flow entries
must be readable immediately by the lcore that just wrote them.
That probably means `Layer::Output` needs a GAT so the head's
lookup can return something other than `&Self::Output` --
borrows that re-borrow through an Arc-published internal
structure, or owned values for cheap-to-clone types. Parked
for a follow-on session.
3. **Op shape works as designed.** `MutableHead::Op` and
`Absorb::Op` being different types lets the head decompose
per-head ops (Install carries a rule with its priority) into
per-key absorb-able pieces. No friction here.
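The shadow-rule approach from open question 1 can be sketched as below. This is not acl-stack's `update.rs`; the types are hypothetical, the match expression is reduced to a single optional port, and the choice of "one priority value below the original" as the shadowing precedence is an assumption (lower value meaning higher precedence).

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Action {
    Allow,
    Drop,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rule {
    pub priority: u32,         // lower value = higher precedence (assumed)
    pub dst_port: Option<u16>, // None = wildcard
    pub action: Action,
}

impl Rule {
    pub fn matches(&self, port: u16) -> bool {
        self.dst_port.map_or(true, |p| p == port)
    }
}

/// Build a shadow that neutralises `original`: same match expression,
/// the table's default action, and a precedence that beats the original.
/// Returns None when no higher-precedence priority value exists.
pub fn shadow_for(original: &Rule, default_action: Action) -> Option<Rule> {
    let priority = original.priority.checked_sub(1)?;
    Some(Rule { priority, dst_port: original.dst_port, action: default_action })
}

/// First match across layers ordered newest-first.
pub fn classify(layers: &[Vec<Rule>], port: u16, default_action: Action) -> Action {
    layers
        .iter()
        .find_map(|layer| layer.iter().find(|r| r.matches(port)))
        .map_or(default_action, |r| r.action)
}
```

Note that `shadow_for` needs the original rule in hand, which is exactly the information the head does not have when the rule lives in a lower layer.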
# Tests
Six new tests, all passing, clippy clean, fmt clean (20 total in
the cascade crate now):
- empty_cascade_returns_no_match
- default_allow_in_tail_matches
- install_rule_takes_effect_after_rotation
(note: write -> read latency = one rotation interval)
- higher_precedence_rule_shadows_lower
- cascade_walk_respects_sealed_order
- compact_collapses_layers_preserving_precedence
This slice does NOT integrate with acl-stack's Classifier. It is
a standalone consumer-shaped test that surfaces the same design
pressure acl-stack would. Real integration with acl-stack is the
next step, but the open questions above should be resolved first.
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Closes the cascade design chapter by wiring up the drain-event
subscription mechanism. Subscribers receive `Arc<S>` (the
freshly-sealed layer) on every rotation, decoupling the cascade's
internal lifecycle from the consumers that react to it (hardware
programmer, replicator, software classifier service, etc.).
# Feature gate
Behind the `subscribe` feature so consumers driving the cascade
synchronously do not pull tokio. The crate compiles cleanly
under `--no-default-features` as well as
`--features subscribe,bolero`.
# API surface
Cascade::new(head, tail)
    Constructs with the default drain channel capacity.
Cascade::with_drain_capacity(head, tail, capacity)
    Explicit capacity override for tuning lag tolerance.
Cascade::subscribe(&self) -> broadcast::Receiver<Arc<S>>
    Each call returns a fresh receiver. Sees only drains that
    occur after the subscribe call -- no backfill. Consumers
    pair this with Cascade::snapshot() to capture the
    pre-subscription state.
Cascade::subscriber_count(&self) -> usize
    Diagnostic.
# Lifecycle
`rotate` emits the freshly-sealed `Arc<S>` after both slot stores
complete. This means subscribers never observe a sealed layer
that the cascade itself has not yet published, and the Arc the
subscriber receives is pointer-equal to the Arc stored in the
sealed vector (verified by
`rotate_emitted_arc_is_the_same_as_in_sealed_vec`). Sharing the
allocation keeps subscriber holds aligned with the cascade's
reclamation -- when the last subscriber drops their Arc and the
cascade has compacted past the layer, the allocation drops.
Send is non-blocking. No subscribers means send returns Err
which we deliberately swallow (the no-subscriber case is normal,
not exceptional). A full broadcast buffer means lagging
subscribers receive RecvError::Lagged(n) on their next recv; the
convention is for them to react by resyncing from a fresh
Snapshot. The cascade is never blocked by slow consumers.
`compact` does NOT emit. Compaction is internal lifecycle that
does not change the data visible to subscribers' merged state --
the just-folded entries were already delivered when their sealed
layers were first emitted by their rotation. Emitting on
compact would double-deliver and complicate consumer state.
# Race semantics
A drain that happens between `subscribe()` and `snapshot()` is
visible in both the snapshot AND the receiver's queue. This is
documented as a contract: consumers' state updates must be
idempotent under repeated application of the same `Arc<S>` (or
the consumer must implement its own de-duplication via the
sealed layer's generation tag once we add one). For the
MergeInto-based merge model this is automatic -- applying the
same delta twice is a no-op once the first application is in.
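The idempotence requirement can be illustrated with a toy overlay merge. This is a sketch only: `Delta` and `apply` are hypothetical names, and a `BTreeMap` overlay stands in for whatever state a real subscriber maintains.

```rust
use std::collections::BTreeMap;

pub type Delta = BTreeMap<u32, u32>;

/// Overlay `delta` onto `state`; a delta's value for a key replaces the old one.
pub fn apply(state: &mut BTreeMap<u32, u32>, delta: &Delta) {
    for (k, v) in delta {
        state.insert(*k, *v);
    }
}
```

If a drain lands between `subscribe()` and `snapshot()`, the consumer seeds its state from a snapshot that already contains the delta and then receives the same delta from the queue; with an overlay merge of this shape the second application leaves the state unchanged.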
# Tests (6 new)
- rotate_emits_drain_event_to_subscriber: basic delivery
- multiple_subscribers_each_get_their_own_copy: fan-out
- no_subscribers_does_not_panic_on_rotate: empty-fanout safety
- subscriber_created_after_rotate_misses_that_drain: no
backfill semantic
- slow_subscriber_sees_lagged_when_channel_overflows: capacity
exhaustion produces RecvError::Lagged
- rotate_emitted_arc_is_the_same_as_in_sealed_vec: Arc identity
between subscriber-side and cascade-side allocations
All 26 tests pass. Clippy clean under
`--features bolero,subscribe`. `--no-default-features` compiles
the crate without tokio.
# Helper added
`Snapshot::sealed(&self) -> &[Arc<S>]` exposes the sealed-vec
contents for callers that need direct Arc access (the test
`rotate_emitted_arc_is_the_same_as_in_sealed_vec` uses it for
the pointer-equality check; consumer code is welcome to use it
for read-side enumeration of sealed layers without going through
the full cascade walk).
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
First slice of the ACL chapter. A real (non-test-file) `acl` crate
that uses cascade as the publication and storage primitive,
designed deliberately small so the cascade integration shape gets
nailed down before we grow the surface.
# What landed
## Crate scaffold
`acl/Cargo.toml` -- depends on cascade and concurrency, nothing
else. Registered in workspace as `dataplane-acl`.
## Core types (acl/src/types.rs)
Priority(u32) -- lower value is higher precedence (acl-stack
                 convention). PartialOrd/Ord by numeric value.
Action        -- Allow | Drop. Will grow when consumers need
                 Redirect / Trap / Count / etc.
Protocol      -- Tcp | Udp | Icmp. Closed enum; will grow.
Headers       -- minimal placeholder: src/dst IPv4, protocol,
                 src/dst port. Documented as the swap site for
                 `net::headers::Headers` once we are ready to
                 take the net dependency.
Match         -- per-field wildcard-or-equal conjunction. Empty
                 match (`Match::any()`) matches every packet.
AclRule       -- priority + match + action. Copy.
## Cascade layers (acl/src/layers.rs)
AclHead   -- Mutex<BTreeMap<Priority, AclRule>>. Layer::lookup
             returns Continue (writes visible after the next
             rotate). MutableHead with Op = AclOp::Install(rule);
             seal copies the BTreeMap into an AclSealed.
AclSealed -- immutable priority-sorted Vec<AclRule>. Layer::lookup
             walks in precedence order and returns first match.
AclTail   -- structurally identical to AclSealed for this first
             slice. Kept as a distinct nominal type so we can
             later swap in a DPDK-backed tail (wrapping
             `dpdk::acl::AclContext`) without touching the cascade
             composition.
Absorb for AclRule: last-writer-wins (concurrent installs at the
same priority resolve to whichever arrived last in the BTreeMap's
internal ordering -- acceptable because priority IS the rule's
identity; the control plane should not issue conflicting installs).
MergeInto<AclTail> for AclSealed: dedup-by-priority with
newer-wins-on-conflict, mirroring the cascade walk semantic.
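A rough sketch of the matching and merge rules described in this section. The field set is cut down to source address and destination port, the struct layouts are guesses, not the contents of `acl/src/types.rs`, and `lookup` / `merge` are hypothetical free functions standing in for the `Layer` and `MergeInto<AclTail>` impls.

```rust
use std::collections::BTreeMap;
use std::net::Ipv4Addr;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Action {
    Allow,
    Drop,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Headers {
    pub src: Ipv4Addr,
    pub dst_port: u16,
}

/// Each field is a wildcard (`None`) or must equal the packet's value.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Match {
    pub src: Option<Ipv4Addr>,
    pub dst_port: Option<u16>,
}

impl Match {
    /// The empty match: every field is a wildcard.
    pub fn any() -> Self {
        Self::default()
    }
    /// Conjunction: every non-wildcard field must agree with the packet.
    pub fn matches(&self, h: &Headers) -> bool {
        self.src.map_or(true, |s| s == h.src)
            && self.dst_port.map_or(true, |p| p == h.dst_port)
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AclRule {
    pub priority: u32, // lower value = higher precedence
    pub m: Match,
    pub action: Action,
}

/// First match over rules sorted by ascending priority value.
pub fn lookup(sorted: &[AclRule], h: &Headers) -> Option<Action> {
    sorted.iter().find(|r| r.m.matches(h)).map(|r| r.action)
}

/// Dedup by priority; on conflict the rule from `newer` wins.
/// The result comes back sorted by priority.
pub fn merge(newer: &[AclRule], older: &[AclRule]) -> Vec<AclRule> {
    let mut by_prio: BTreeMap<u32, AclRule> =
        older.iter().map(|r| (r.priority, *r)).collect();
    by_prio.extend(newer.iter().map(|r| (r.priority, *r)));
    by_prio.into_values().collect()
}
```

Keying the merge on priority is what makes "priority IS the rule's identity" hold after compaction: two rules at the same priority never coexist in the merged output.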
## Classifier (acl/src/classifier.rs)
The consumer-facing API. Wraps `Cascade<AclHead, AclSealed, AclTail>`
and exposes:
Classifier::new(default_action)
Classifier::classify(&headers) -> Action
Classifier::install(AclRule)
Classifier::rotate()
Classifier::compact(keep)
Classifier::snapshot() -> Snapshot
Classifier::sealed_depth() -> usize
Classifier::default_action() -> Action
Removal is intentionally NOT exposed as a method. Per the cascade
design conversation, ACL removal requires synthesising a shadow
rule with the table's default action at a precedence that beats
the rule being removed; this lives in user code (where the default
action is known and the original rule's match expression is
accessible via Snapshot) rather than in the framework.
## Tests (13 total, all passing)
Unit tests on types (5):
any_matches_everything
single_field_match
single_field_mismatch_rejects
conjunction_requires_all_fields
priority_lower_is_higher_precedence
Integration tests on Classifier (8):
empty_classifier_returns_default_action
install_alone_is_not_yet_visible
install_then_rotate_makes_rule_visible
lower_priority_value_shadows_higher_value
src_ip_narrows_rule_to_allowlist
rules_from_multiple_rotations_compose_correctly
compact_preserves_classification_results
snapshot_held_across_rotation_pins_old_state
The integration tests use only the public `Classifier` surface
-- no peeking into cascade internals. That pins the consumer
contract and surfaces any ergonomics gaps as test friction.
The `snapshot_held_across_rotation_pins_old_state` test is load-
bearing for the QSBR reclamation story applied to ACL specifically:
a held snapshot keeps observing its pre-rotate composition while
fresh snapshots see the new state.
# What is intentionally not here
- Removal (see comment above; user-code shadow-rule synthesis)
- DPDK-backed tail (will be a `DpdkAclTail` variant in a follow-on)
- Subscription wiring for hardware programmer / replicator
(cascade exposes broadcast; classifier will eventually surface it)
- net::headers::Headers integration (placeholder Headers type
documented as the swap site)
- Richer Match expression types (ranges, prefixes, masks)
- Property tests for the MergeInto<AclTail> impl
Each of those is a separate slice to land deliberately when
we have a concrete need. The intent of this commit is to nail
down the cascade-integration shape on real Rust code, not to
ship a feature-complete ACL.
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Two terminology refreshes that the worktree TODOs flagged. Both
mechanical; no semantic change.
# Absorb -> Upsert
The trait was previously named `Absorb` (carrying the merge-into-
self semantic). Renamed to `Upsert` because:
* The semantic is literally upsert: `seed` constructs the value
from the first op observed for a key; `upsert` folds each
subsequent op into the existing value. The new name directly
names the operation.
* `Absorb` overlaps with the `left-right` crate's `Absorb` trait,
which has different contractual obligations. Naming overlap is
a real source of cognitive load when both crates are in the
workspace dependency tree.
* The method is also renamed `absorb` -> `upsert`.
The trait keeps its two-method shape (`seed` for first-insert,
`upsert` for subsequent updates). Whether we collapse to a
Default-bounded single method is a separate design conversation;
that one is parked.
File rename: `cascade/src/absorb.rs` -> `cascade/src/upsert.rs`;
`cascade/tests/absorb_properties.rs` ->
`cascade/tests/upsert_properties.rs`.
# MutableHead::Sealed -> MutableHead::Frozen + seal -> freeze
The associated type was previously named `Sealed`. Renamed to
`Frozen` because `Sealed` overlaps with the well-known
"sealed trait" pattern in Rust (where a trait is sealed against
external implementations), and reusing the name as an associated
type was confusing in practice.
`Frozen` reads more accurately for what the type represents: the
immutable layer produced by freezing the head's contents. The
method `seal` is renamed correspondingly to `freeze`.
The cascade's generic position changes from `Cascade<H, S, T>` to
`Cascade<H, F, T>` to match. Variable names (`new_sealed` ->
`new_frozen`, etc.), field names (`sealed: Slot<Vec<Arc<F>>>` ->
`frozen: Slot<Vec<Arc<F>>>`), and accessors (`sealed_depth` ->
`frozen_depth`, `sealed()` -> `frozen()`) follow.
# Pin / mem::replace consideration
The user flagged that the "frozen" semantic is spiritually similar
to `Pin<T>` on `?Unpin` data: something promised not to be moved
or swapped out. In our cascade, that invariant is provided by
`Arc` indirection rather than by Pin -- once a layer is in
`Arc<Frozen>` the only access is through `&Frozen`, so no
`mem::replace`-shaped escape hatch is reachable from outside the
cascade-manager thread. No code change needed; noted here for
when async paths get added.
# What did NOT change
* Two-method shape on Upsert (seed + upsert). Whether to
collapse via Default bound is parked.
* The Layer trait (still TODO to potentially merge into a
Lookup-shaped GAT trait; deferred until a real consumer
demands GAT-on-Output).
* Cascade's three-type-parameter shape (`Cascade<H, F, T>`).
The user TODO about eliminating F via `H::Frozen` is real but
a separate refactor.
# Validation
* 39 tests pass (26 cascade + 13 acl), clippy clean under
`--features bolero,subscribe`, fmt clean.
* Workspace compiles cleanly with the new sysroot
(`DATAPLANE_SYSROOT=$(readlink -f sysroot) cargo check`).
* `--no-default-features` build of cascade still works (the
feature gates on `bolero` / `subscribe` are unchanged).
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
The cascade's frozen vector now stores FrozenEntry { generation,
layer }; rotate(generation, mk_head) takes the gen from the caller
instead of allocating one internally. Snapshot::lookup_at(input,
horizon) walks only frozen entries with gen <= horizon for
Reitblatt-style per-packet consistency. compact_through(watermark)
folds layers at or below the watermark into the tail.
Generation is NonZeroU64 internally with a wire_stamp() projection
to u24 for NIC ACL stamp fields. The cascade owns no counter;
allocation, wrap handling, and reset belong to the pipeline manager
(forthcoming dataplane-mat-runtime crate).
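A small sketch of the generation-filtered walk. `Generation`, `FrozenEntry`, and `lookup_at` borrow names from the text, but their definitions here are assumptions: the u24 projection is taken to be the low 24 bits, the frozen slice is assumed newest-first, and the per-layer lookup is passed in as a closure, whereas the real code goes through the `Layer` trait.

```rust
use std::num::NonZeroU64;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Generation(NonZeroU64);

impl Generation {
    pub fn new(raw: u64) -> Option<Self> {
        NonZeroU64::new(raw).map(Generation)
    }
    /// Project onto 24 bits (assumed here: keep the low 24 bits).
    pub fn wire_stamp(self) -> u32 {
        (self.0.get() & 0x00FF_FFFF) as u32
    }
}

pub struct FrozenEntry<L> {
    pub generation: Generation,
    pub layer: L,
}

/// Walk frozen entries newest-first, skipping any with gen > horizon,
/// and return the first definitive answer from `probe`.
pub fn lookup_at<'a, L, O>(
    frozen: &'a [FrozenEntry<L>],
    horizon: Generation,
    probe: impl Fn(&'a L) -> Option<O>,
) -> Option<O> {
    frozen
        .iter()
        .filter(|e| e.generation <= horizon)
        .find_map(|e| probe(&e.layer))
}
```

A worker that fixed its horizon at generation 2 keeps getting generation-2 answers even after a generation-3 layer is frozen, which is the per-packet consistency property the text refers to.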
DrainEvent is no longer feature-gated on `subscribe` so downstream
facades can name it without pulling tokio.
See .scratch/mat-pipeline-rfc/0001-mat-pipeline.md for the design.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Classifier::rotate now takes a Generation supplied by the caller; in
production this comes from the pipeline manager's policy-gen
allocator. Adds Classifier::compact_through(watermark) for
per-packet-consistent compaction. Re-exports Generation from cascade
so direct downstream consumers do not need a cascade dependency at
call sites.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New dataplane-mat crate. Defines the consumer-facing trait surface
that the forthcoming pipeline runtime and its plugins program
against:
- OriginId, OriginSeq, FlowOrigin for cross-dataplane state-sync
  metadata
- TransportSeq and StateSyncMessage for wire framing
- MatSubscriber<H, F> trait for things that consume DrainEvent
- WatermarkReporter trait for opt-in compaction gating
No DPDK / k8s / tokio dependencies. Re-exports Generation from
cascade.
See .scratch/mat-pipeline-rfc/0001-mat-pipeline.md for the design.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New dataplane-mat-runtime crate that provides the reusable pipeline
manager building blocks. Owns the dataplane-wide policy generation
counter and bundles cascades with their subscribers.
PolicyGenAllocator implements the two-atomic begin-rollout / publish
dance: `next` is the monotonic allocator, `current` is the published
horizon workers read at batch boundaries. `next` starts one above
`current` so a staged-but-unpublished rollout is invisible to workers
(its frozen layer has gen > current, so `Snapshot::lookup_at` skips
it). `publish` is the commit point.
ManagedCascade wraps a Cascade with two registries: MatSubscribers
(receive DrainEvents synchronously on every rotation) and
WatermarkReporters (their min is computed by
`compact_to_aggregated_watermark` to drive `Cascade::compact_through`
safely). The cascade is private to the wrapper to prevent bypass of
subscriber fan-out. A HeadFactory closure is stored at construction
so the per-NF rotate site stays terse.
Twelve tests covering allocator semantics, rollout commit-at-publish,
subscriber fan-out, watermark aggregation including the
"any-reporter-None blocks compaction" rule, and the combined
subscriber+reporter pattern that the hardware-offload programmer will
use.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
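A sketch of the two-atomic scheme and the watermark rule, under stated assumptions: the method names `begin_rollout`, `publish`, and `current`, the starting values (`current = 1`, `next = 2`), the memory orderings, and the `aggregate_watermark` helper are all illustrative guesses, not the crate's API. Generations are bare `u64`s here for brevity.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

pub struct PolicyGenAllocator {
    next: AtomicU64,    // monotonic allocator
    current: AtomicU64, // published horizon read by workers
}

impl PolicyGenAllocator {
    pub fn new() -> Self {
        // `next` starts one above `current` (concrete values assumed).
        Self { next: AtomicU64::new(2), current: AtomicU64::new(1) }
    }
    /// Reserve a generation for a staged rollout; workers cannot see it yet.
    pub fn begin_rollout(&self) -> u64 {
        self.next.fetch_add(1, Ordering::AcqRel)
    }
    /// Commit point: the horizon advances to include `generation`.
    pub fn publish(&self, generation: u64) {
        self.current.fetch_max(generation, Ordering::AcqRel);
    }
    pub fn current(&self) -> u64 {
        self.current.load(Ordering::Acquire)
    }
}

/// Minimum over all reporters; any `None` blocks compaction.
/// In this sketch an empty reporter list also yields `None`.
pub fn aggregate_watermark(reports: &[Option<u64>]) -> Option<u64> {
    reports
        .iter()
        .copied()
        .collect::<Option<Vec<u64>>>()?
        .into_iter()
        .min()
}
```

Between `begin_rollout` and `publish` the staged generation is strictly above `current`, so a horizon-filtered lookup at `current` skips its frozen layer.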
Integration test exercising a flavor-B cascade end-to-end through
ManagedCascade + PolicyGenAllocator. Defines a minimal FlowKey, a
FlowEntry value type that carries FlowOrigin in its metadata, and an
Upsert impl that resolves conflicts via (origin_id, origin_seq)
lexicographic LWW.
Tests:
- Locally inserted flow visible after rotation+publish.
- LWW resolution converges regardless of arrival order (the central
  active-active convergence property).
- Same-origin higher-seq wins.
- Equal-key writes are idempotent.
- Long-lived flow persists across policy-gen advance (the conntrack
  "outlives the authorising rule" property).
- MergeInto reconciles via LWW during compaction.
Surfaced design pressure: the cascade walk is newer-shadows-older,
not LWW-reconciling. A higher-LWW entry sitting in an older sealed
layer is shadowed by a lower-LWW newer-layer entry on lookup until
compaction reconciles. Production state-sync must dedup on the
receive side so cascade.write only carries the current LWW winner.
Documented in the RFC's known-unknowns section as item 9.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
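The lexicographic LWW rule can be sketched in a few lines. The field widths and the `state` payload are assumptions, and `upsert` is a free function here, not the `Upsert` trait impl from the test file.

```rust
/// Derived `Ord` compares fields in declaration order, which gives the
/// lexicographic (origin_id, origin_seq) comparison.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct FlowOrigin {
    pub origin_id: u32,
    pub origin_seq: u64,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FlowEntry {
    pub origin: FlowOrigin,
    pub state: u8, // placeholder payload
}

/// Keep whichever entry has the larger (origin_id, origin_seq) pair.
pub fn upsert(existing: FlowEntry, incoming: FlowEntry) -> FlowEntry {
    if incoming.origin > existing.origin {
        incoming
    } else {
        existing
    }
}
```

Because the winner depends only on the origin pair and not on arrival order, folding the same set of writes in any order ends on the same entry, and re-applying a write is a no-op.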
Adds a small trait that the receiver-side state-sync machinery uses
to pull FlowOrigin out of an incoming entry without forcing the value
type to expose FlowOrigin as a field. Lets value types that pack
origin metadata into a smaller representation (bit-stealing in a
u128, etc.) participate in cross-dataplane replication.
Blanket impl on FlowOrigin itself so it can stand in for a value type
in tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First plugin crate. Provides the receiver-side dedup and policy-gen
buffering machinery that the production state-sync transport will
compose with serialisation and connectivity management.
PeerDedup::accept tracks per-origin high-water (origin_seq) and
classifies an incoming message as one of:
- Apply: passed dedup and policy_gen_at_create is at or below
  current_policy_gen; ready to write to the cascade.
- Skip: origin_seq <= seen[origin_id]; duplicate or out-of-order.
- Buffered: policy_gen_at_create > current_policy_gen; stashed in a
  BTreeMap keyed by gen for later release.
PeerDedup::advance_policy_gen drains the buffer up to the new gen and
returns the released entries (caller writes them to cascade).
PeerDedup::drop_buffered_from_peer evicts a dead peer's buffered
entries on k8s / health-probe signal. The seen-tracker is NOT reset
on eviction: recovery is via fresh snapshot resync, not in-band
retransmit.
This addresses the LWW-at-write-time mitigation called out in RFC
known-unknown #9: by deduping on the receive side, each cascade.write
only carries the current LWW winner.
12 tests cover the full surface: first-receive, replay,
out-of-order, distinct-origin independence, buffer/release across gen
advances, and peer eviction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
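The three-way classification can be sketched as follows. The struct layout, the `Msg` and `Verdict` types, and the decision to bump the high-water mark before buffering are assumptions made for illustration; only the Apply / Skip / Buffered rules and the BTreeMap-keyed buffer come from the description above.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Apply,
    Skip,
    Buffered,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Msg {
    pub origin_id: u32,
    pub origin_seq: u64,
    pub policy_gen_at_create: u64,
}

#[derive(Default)]
pub struct PeerDedup {
    seen: HashMap<u32, u64>,         // per-origin high-water origin_seq
    buffer: BTreeMap<u64, Vec<Msg>>, // keyed by policy gen
    current_policy_gen: u64,
}

impl PeerDedup {
    pub fn accept(&mut self, m: Msg) -> Verdict {
        // Duplicate or out-of-order: at or below the origin's high-water mark.
        if self.seen.get(&m.origin_id).map_or(false, |&hw| m.origin_seq <= hw) {
            return Verdict::Skip;
        }
        // Assumption: the high-water mark advances even when buffering.
        self.seen.insert(m.origin_id, m.origin_seq);
        if m.policy_gen_at_create > self.current_policy_gen {
            self.buffer.entry(m.policy_gen_at_create).or_default().push(m);
            return Verdict::Buffered;
        }
        Verdict::Apply
    }

    /// Raise the horizon and return buffered messages now at or below it.
    pub fn advance_policy_gen(&mut self, new_gen: u64) -> Vec<Msg> {
        self.current_policy_gen = new_gen;
        let mut released = Vec::new();
        // Pop buckets in ascending gen order until one is above the horizon.
        while let Some(entry) = self.buffer.first_entry() {
            if *entry.key() > new_gen {
                break;
            }
            released.extend(entry.remove());
        }
        released
    }
}
```

The caller is expected to write the released messages into the cascade itself; the sketch only decides what is ready.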
Wires together every piece built so far: two ManagedCascades each
with a ShipToPeer subscriber and a PeerDedup, joined by in-memory
wires. Exercises the full round trip: local write -> ship -> peer
dedup -> apply -> visible at peer.
Shared test types extracted into tests/common/mod.rs so flow_state
and e2e_sync don't duplicate them.
Surfaced design pressure: cross-dataplane policy_gen comparison is
unsound. policy_gen is dataplane-local; two dataplanes have no shared
integer scale. The buffer comparison in PeerDedup
(entry.policy_gen_at_create > receiver.current_policy_gen) was mixing
values from independent timelines. Documented as RFC known-unknown
#10 with two viable resolutions: reintroduce a control-plane-supplied
config_ref, or drop the buffer entirely.
E2e tests pass an "effectively unbounded" horizon to bypass the
unsound comparison. The buffer mechanism remains tested in isolation
in mat-state-sync/tests/dedup.rs where the contrived policy_gen scale
is appropriate for unit tests.
Five end-to-end tests:
- single-write propagation
- replay absorbed via dedup
- concurrent writes converge via LWW (after compaction)
- shipper filter suppresses amplification echo
- buffered entries drain via advance_policy_gen
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PolicyGenAllocator gains the wrap/reset machinery needed for the
u24 wire-stamp scheme:
PressureLevel { Free, Throttle, Aggressive, Block } reports the
regime based on the wire stamp (next % 2^24) of the next
allocation:
- Free (< 2^23): allocate normally.
- Throttle (>= 2^23): manager slows policy intake, starts
opportunistic reset attempts.
- Aggressive (>= ~7/8 of 2^24): exponential back-pressure,
actively try to quiesce for reset.
- Block (>= 2^24 - 2): refuse new rollouts until reset.
Thresholds exposed as public consts (THROTTLE_AT, AGGRESSIVE_AT,
BLOCK_AT) so callers can reason about them.
try_reset succeeds when next == current + 1 (no rollout staged)
and returns ResetError::RolloutStaged otherwise. Caller-enforced
preconditions (frozen chains empty, in-flight stamps drained, no
long-running snapshots) are documented but cannot be verified by
the allocator itself.
The check is on the wire stamp (modulo 2^24), not the internal
u64 counter -- the constraint that drives the regime is u24 NIC
stamp space, not internal counter exhaustion.
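The regime check and the reset precondition can be sketched as below. The constant names come from the text; the exact value chosen for `AGGRESSIVE_AT` (7/8 of 2^24), the free-function shape, and the post-reset values are assumptions for illustration.

```rust
pub const STAMP_SPACE: u64 = 1 << 24;
pub const THROTTLE_AT: u64 = 1 << 23;
pub const AGGRESSIVE_AT: u64 = STAMP_SPACE / 8 * 7; // assumed exact value for "~7/8"
pub const BLOCK_AT: u64 = STAMP_SPACE - 2;

#[derive(Debug, PartialEq, Eq)]
pub enum PressureLevel {
    Free,
    Throttle,
    Aggressive,
    Block,
}

/// Classify by the wire stamp of the next allocation, not the raw counter.
pub fn pressure(next: u64) -> PressureLevel {
    let stamp = next % STAMP_SPACE;
    if stamp >= BLOCK_AT {
        PressureLevel::Block
    } else if stamp >= AGGRESSIVE_AT {
        PressureLevel::Aggressive
    } else if stamp >= THROTTLE_AT {
        PressureLevel::Throttle
    } else {
        PressureLevel::Free
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum ResetError {
    RolloutStaged,
}

/// Reset only when nothing is staged, i.e. next == current + 1.
pub fn try_reset(next: &mut u64, current: &mut u64) -> Result<(), ResetError> {
    if *next != *current + 1 {
        return Err(ResetError::RolloutStaged);
    }
    // Assumed post-reset state: back to the initial pair.
    *current = 1;
    *next = 2;
    Ok(())
}
```

Because the check is modular, a counter just past a multiple of 2^24 reads as `Free` again; that only describes a safe state if the caller-enforced preconditions listed above held at the wrap.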
Seven new unit tests covering pressure regime transitions, reset
preconditions, and post-reset state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5ac8c35 to
29b90e5
This did the thing for the moment and is too big to review as such. Closing in favor of adding this logic in a different order.