ORE v2 (6/n): chained-prefix variable-length scheme + string encryption #83
coderdan wants to merge 5 commits into
…nit, left-query comparator (#83 review)
Review follow-ups on the chained scheme:
- zeroize: wipe L = E_k(0) in CmacAccumulator::new (K1's source; spec section 9) and the per-block `ro` RO-tag buffer in encrypt_var (matches the bit2_w6 scratch wipe). Existing Drops (k1/state/k_acc/stream) were already correct.
- Replace the BRANCH_* u8 consts + a would-be debug_assert with a non-zero enum Branch { RoKey = 1, PrpStream = 2 }; final_block takes Branch, so a final block's byte 0 can never be 0x00 and collide with a prefix block. The injectivity separator is now a compile-time guarantee.
- count > u16::MAX returns OreError::TooManyBlocks (new variant) in both encrypt_var and encrypt_left_var, instead of a debug_assert.
- Single-key init: the chained scheme is single-key by design (the CMAC accumulator derives every per-block secret from one key via branch tags), so drop the vestigial _k2 parameter: init(k1). (Bit6 still uses two PRF keys.)
- Add compare_left_to_full: the Lewi-Wu query path. The comparator core is factored into compare_views(left_view, full); compare_raw_slices (full,full) and compare_left_to_full (left,full) both call it, so the left-only artifact from encrypt_left_* is usable as a query against a stored ciphertext. Adds prop_left_query_matches_full + an artifact-rejection unit test.
No wire-format change.
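The Branch change above can be illustrated with a minimal sketch. Only the enum and its discriminants come from the commit message; the 15-byte payload, the helper bodies, and the exact byte layout are assumptions made for illustration, not the layout in primitives/cmac.rs.

```rust
/// Domain-separation tag carried in byte 0 of a final block.
/// Both discriminants are non-zero, so the tag can never equal the
/// 0x00 that marks a prefix block.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Branch {
    RoKey = 1,
    PrpStream = 2,
}

/// Hypothetical prefix-block layout: byte 0 = 0x00, then the payload.
fn prefix_block(payload: [u8; 15]) -> [u8; 16] {
    let mut block = [0u8; 16];
    block[1..].copy_from_slice(&payload);
    block
}

/// Hypothetical final-block layout: byte 0 = branch tag, then the payload.
/// Taking `Branch` instead of a raw `u8` means a caller cannot pass 0.
fn final_block(branch: Branch, payload: [u8; 15]) -> [u8; 16] {
    let mut block = [0u8; 16];
    block[0] = branch as u8;
    block[1..].copy_from_slice(&payload);
    block
}
```

With this shape, final_block(Branch::RoKey, p) and prefix_block(p) differ in byte 0 for every payload p, which is the property the old debug_assert would only have checked at run time.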
Review: code / constant-time / zeroization
Ran the multi-agent code review plus the Trail of Bits Applied
Constant-time
Inherits Bit6's posture: CT-1 fixed (
Zeroization
Two new gaps found and fixed:
Open (not blocking)
No pinned wire vectors yet (deliberately deferred to A2 sign-off, the freeze gate); a few low test-gaps noted in the review.
…PR 6 foundation)
- primitives/cmac.rs: incremental AES-CMAC (NIST SP 800-38B) accumulator per the A2 spec: absorb (CBC prefix) / finalize (E_k(S^F^K1)); prefix/final block encoders with the injective byte layout; K1 reuses gf128_double. Validated against the NIST AES-128 CMAC test vectors.
- prp.rs: factor LemireFyPrp::from_stream (shape ii) out of new(); new() now calls it, so Bit6 vectors stay byte-identical. Lets the chained scheme key the PRP from the accumulator stream with no per-block AES key schedule.
- hash.rs: gf128_double_u128 -> pub(crate) (shared by H and CMAC subkeys).
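For readers unfamiliar with the CMAC subkey step, here is a minimal sketch of the doubling that derives K1 from L = E_k(0) in SP 800-38B. The function names are hypothetical and this is not the crate's vectorized gf128_double; it assumes the block is read as a big-endian 128-bit integer.

```rust
/// Doubling in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1 (constant 0x87),
/// with the 16-byte block interpreted as a big-endian u128.
fn gf128_double_sketch(x: u128) -> u128 {
    // All-ones when the top bit is set, zero otherwise, so the reduction
    // is applied without branching on the (secret) input.
    let carry_mask = 0u128.wrapping_sub(x >> 127);
    (x << 1) ^ (carry_mask & 0x87)
}

/// SP 800-38B subkeys: K1 = 2 * L, K2 = 2 * K1, where L = E_k(0^128).
/// The caller supplies L; no AES is performed in this sketch.
fn cmac_subkeys_sketch(l: u128) -> (u128, u128) {
    let k1 = gf128_double_sketch(l);
    let k2 = gf128_double_sketch(k1);
    (k1, k2)
}
```

Feeding in the L value from the published AES-128 CMAC example (key 2b7e1516 28aed2a6 abf71588 09cf4f3c, L = 7df76b0c 1ab899b3 3e42f047 b91b546f) yields K1 = fbeed618 35713366 7c85e08f 7236a8de.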
…on (PR 6)
OreAes128Bit6Chained: variable-length BlockORE over 6-bit blocks, driving
the AES-CMAC accumulator (A2 spec). Per block: PRP keystream from the
accumulator's PRP_STREAM branch via LemireFyPrp::from_stream (shape ii, no
per-block key schedule); ro_keys from the RO_KEY branch; left tag
f[n]=ro(n,xt[n]); right block = H-mask XOR indicator (reuses the BHKR
sigma-MMO H + oblivious compare read). Vec-backed Var{Left,Right,CipherText}
with the v2 header (scheme id 0x03, u16 block count).
- encrypt_var / encrypt_left_var / encrypt_str / encrypt_left_str.
- compare_raw_slices: constant-time prefix scan, lexicographic (shorter
sorts first); rejects mismatched scheme/version/length.
- No total-length binding in the derivation, so prefix-sharing strings of
different lengths compare correctly (spec section 7).
- Tests: cross-length lexicographic order, equality across nonces, >14-block
plaintexts, roundtrip, empty string, cross-scheme rejection.
- benches/chained.rs.
Perf (M1 Max): ~0.69 us/block (encrypt), below fixed-N Bit6's ~0.81 us/block
-- the shape-ii no-key-schedule win. compare ~402ns.
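Because the v2 header stores the block count as a u16, a variable-length plaintext can exceed what the header can represent. The review follow-up on this PR reports that case as OreError::TooManyBlocks. A minimal sketch of such a check follows; the one-variant enum and the helper name are assumptions for illustration and do not mirror the crate's actual error type.

```rust
/// Stand-in for the crate's error type; only the variant named in the
/// review follow-up is modelled here.
#[derive(Debug, PartialEq, Eq)]
enum OreError {
    TooManyBlocks,
}

/// Hypothetical helper: narrow a block count to the u16 header field,
/// returning an error rather than relying on a debug-only assertion.
fn checked_block_count(count: usize) -> Result<u16, OreError> {
    if count > u16::MAX as usize {
        Err(OreError::TooManyBlocks)
    } else {
        // Safe to narrow: count is at most 65535 here.
        Ok(count as u16)
    }
}
```

Returning an error keeps the behaviour identical in debug and release builds, whereas a debug_assert would let an oversized count truncate silently in release.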
Add docs/benchmarks/2026-06-15-chained-results.md (~0.69 us/block, below fixed-N Bit6's ~0.81 -- the shape-ii no-key-schedule win) and a PR 6 status note in the plan roadmap.
…cy, and CMAC accumulator
Adds quickcheck properties matching the crate's existing idiom:
scheme/chained.rs (6 props, lengths capped at 32 blocks so most cases
exceed the 14-block fixed-N cap):
- block-level order == blocks.cmp over the full 6-bit domain
- shared-prefix order (drives the constant-time prefix scan)
- string order == str::cmp across arbitrary Unicode and lengths
- equality-across-nonces with byte divergence
- left ct is deterministic and byte-identical to the full ct's left half
- serialize round-trip
primitives/cmac.rs (3 props):
- incremental absorb/finalize == from-scratch CMAC for any block count
(NIST vectors only cover t in {1,4})
- final/prefix block encodings are injective (tuple recoverable)
All pass at 4000 iterations each; clippy -D warnings and fmt clean.
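The ordering these properties check against can be written as a small plaintext reference model: scan the shared prefix, keep the first non-equal verdict, and fall back to length so that the shorter input sorts first. This is a sketch of the semantics only. It works on plaintext blocks, contains data-dependent branches, and is not the constant-time ciphertext comparator in scheme/chained.rs; the function name is hypothetical.

```rust
use std::cmp::Ordering;

/// Reference model: lexicographic order over block slices.
fn prefix_scan_cmp(a: &[u8], b: &[u8]) -> Ordering {
    let shared = a.len().min(b.len());
    let mut verdict = Ordering::Equal;
    let mut decided = false;
    // Visit every shared position; only the first difference counts.
    for i in 0..shared {
        let here = a[i].cmp(&b[i]);
        if !decided && here != Ordering::Equal {
            verdict = here;
            decided = true;
        }
    }
    if decided {
        verdict
    } else {
        // One input is a prefix of the other: the shorter sorts first.
        a.len().cmp(&b.len())
    }
}
```

For any two slices this agrees with the standard slice ordering (a.cmp(b)), which is what "block-level order == blocks.cmp" asserts for the real comparator.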
Stacked on #82. Implements PR 6 of the ORE v2 program: the chained-prefix, variable-length scheme that enables string encryption (and lifts the fixed-N 14-block packed-prefix cap).
Design (preliminary crypto sign-off given; detailed review in progress)
- docs/plans/2026-06-12-ore-v2-architecture.md §5(b)
- docs/plans/2026-06-15-ore-v2-cmac-accumulator-spec.md: the precise, reviewed-against construction (injective encoding, security argument, open questions).
(Both live in the base "ORE v2 (5/n): 6-bit block scheme + v2 wire format" #82 commits; this PR adds the implementation + a benchmark-results doc + the PR-6 status note.)
What
- primitives/cmac.rs: incremental AES-CMAC accumulator (NIST SP 800-38B). absorb (CBC prefix) / finalize (E_k(S⊕F⊕K1)); injective prefix/final block encoders; K1 reuses the vectorized gf128_double. Validated against the NIST AES-128 CMAC test vectors.
- LemireFyPrp::from_stream (shape ii): PRP keyed directly from the accumulator's PRP_STREAM branch, so there is no per-block AES key schedule. new() was refactored to call it, keeping the Bit6 vectors byte-identical.
- scheme/chained.rs: OreAes128Bit6Chained with encrypt_var / encrypt_left_var / encrypt_str / encrypt_left_str; Vec-backed Var{Left,Right,CipherText} with the v2 header (scheme id 0x03, u16 block count); compare_raw_slices doing a constant-time prefix scan with lexicographic semantics (shorter sorts first), reusing the BHKR σ-MMO H and the oblivious compare read.
- k1 by a labelled AES call), subsuming the fixed-N prf1/prf2 via branch tags.
Tests
Cross-length lexicographic order vs str::cmp (incl. shared prefixes and >14-block strings), equality across nonces, serialize round-trip, empty string sorts first, cross-scheme byte rejection; plus the CMAC NIST KAT and accumulator unit tests. All local CI gates green (fmt, clippy -D warnings, full test).
Performance (Apple M1 Max, docs/benchmarks/2026-06-15-chained-results.md)
"alice" (7 blocks)
"alice@example.com" (23 blocks)
~0.69 µs/block, scaling linearly, and below fixed-N Bit6's ~0.81 µs/block, because the accumulator is keyed once per ciphertext (the shape-(ii) no-key-schedule win, A3).
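The block counts quoted for the benchmark inputs follow from splitting the string's bytes into 6-bit blocks: ceil(8 * 5 / 6) = 7 for "alice" and ceil(8 * 17 / 6) = 23 for "alice@example.com". The sketch below shows one packing that reproduces those counts. The MSB-first bit order and the zero padding of the last block are assumptions; the PR text does not state the encoding encrypt_str actually uses, and the function name is hypothetical.

```rust
/// Pack bytes MSB-first into 6-bit blocks (values 0..=63), zero-padding
/// the final partial block. Assumed encoding, for illustration only.
fn to_6bit_blocks(bytes: &[u8]) -> Vec<u8> {
    let mut blocks = Vec::with_capacity((bytes.len() * 8 + 5) / 6);
    let mut acc: u32 = 0; // pending bits, right-aligned
    let mut nbits: u32 = 0; // number of pending bits in `acc` (always < 6 between bytes)
    for &byte in bytes {
        acc = (acc << 8) | u32::from(byte);
        nbits += 8;
        // Emit as many full 6-bit blocks as are available.
        while nbits >= 6 {
            nbits -= 6;
            blocks.push(((acc >> nbits) & 0x3F) as u8);
        }
        // Drop the bits that were just emitted.
        acc &= (1u32 << nbits) - 1;
    }
    if nbits > 0 {
        // Left-align the leftover bits in a final zero-padded block.
        blocks.push(((acc << (6 - nbits)) & 0x3F) as u8);
    }
    blocks
}
```

Under this packing a 3-byte input maps to exactly 4 blocks, and the empty string maps to zero blocks.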
Not in this PR
u128/i128/Decimal via the const-N + accumulator path (fixed-length, >14 blocks).
Review notes
The A2 spec has 5 open questions (§11): key-derivation slot, width-vs-scheme-id tag, shape-(ii) confirmation, birthday ceiling, u128/Decimal comparability. None block the working implementation; they shape the final freeze.