diff --git a/Cargo.lock b/Cargo.lock
index 15cb51f3a..f6679ad42 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -660,6 +660,17 @@ dependencies = [
  "serde",
 ]
 
+[[package]]
+name = "pq_das"
+version = "0.1.0"
+dependencies = [
+ "backend",
+ "clap",
+ "lean_compiler",
+ "lean_prover",
+ "lean_vm",
+]
+
 [[package]]
 name = "prettyplease"
 version = "0.2.37"
diff --git a/PQ-DAS-V1/LeanVM Implementation Plan.md b/PQ-DAS-V1/LeanVM Implementation Plan.md
new file mode 100644
index 000000000..70a7dbbb6
--- /dev/null
+++ b/PQ-DAS-V1/LeanVM Implementation Plan.md	
@@ -0,0 +1,431 @@
+# PQ-DAS V1 on LeanVM: Implementation Plan
+
+## 1. Executive recommendation
+
+For the first end-to-end demo, use the following profile:
+
+| Item | V1 demo choice |
+|---|---|
+| LeanVM version | Pin commit `2d4148a6082c85769dbf5ef7f3dd61cade17eeb4` (v0.8 / `spec-latest`) |
+| Data and RS base field | KoalaBear, `p = 2^31 - 2^24 + 1 = 0x7f000001` |
+| RS domain | Multiplicative two-adic domain of size `m = 2^s`, `s <= 24` |
+| RS rate | `rho = 1/2` initially |
+| RS layout | Systematic: interpolate the first `k=m/2` evaluations, then evaluate on all `m` points |
+| Cell size | `c = 8` KoalaBear elements |
+| Application hash | LeanVM-native Poseidon1, width 16, rate 8 |
+| Digest | 8 KoalaBear elements |
+| Merkle tree | Binary; Poseidon sponge for leaves and Poseidon feed-forward compression for internal nodes |
+| RS membership | Special `rho=1/2` barycentric check, with challenge and arithmetic in the quintic extension |
+| Public input | One 8-element Poseidon digest committing to the full typed statement and parameter profile |
+| LeanVM proof rate | Benchmark WHIR rates `1/2`, `1/4`, `1/8`, and `1/16`; start with `1/2` |
+| Security mode | Default proven/Johnson-bound mode, not `prox-gaps-conjecture` |
+
+Do not make the first demo generic over the base field or hash inside the LeanVM
+guest. Those are currently native, effectively fixed parts of LeanVM. Make the
+application layer generic over parameter profiles so that alternate constructions
+can be benchmarked as separate compiled programs or, for a different field/hash,
+against a separate LeanVM build.
+
+## 2. Corrections required in the construction document
+
+### 2.1 Symbols and cells are currently conflated
+
+The document first defines `w[i,j]` as a symbol and later reuses it as a cell.
+Use distinct notation:
+
+- `x[i,s]` for field symbol `s`, where `0 <= s < m`.
+- `cell[i,j] = x[i,j*c .. (j+1)*c]`, where `0 <= j < ell=m/c`.
+- A queried column `j` contains `n*c` field elements, not `n` field elements.
+
+If queries return cells, the reconstruction threshold is:
+
+`t_cells = ceil(k/c)`,
+
+provided cells align with consecutive evaluation positions. The current `t=k`
+is correct only when one query returns one field symbol.
+
+### 2.2 Hash inputs need exact serialization and domain separation
+
+Replace every ambiguous concatenation with a typed field-element transcript.
+At minimum, bind:
+
+- protocol/version identifier;
+- parameter-profile identifier;
+- `n`, `m`, `k`, `c`, `ell`, RS check type;
+- row index for row hashes;
+- column index for column hashes;
+- exact logical input length;
+- all row hashes and the column Merkle root.
+
+Use separate domain tags for:
+
+- row/systematic-data hash;
+- column/cell hash;
+- Merkle leaf;
+- public statement;
+- RS membership challenge.
+
+For Merkle internal nodes, V1 should use the dedicated feed-forward compression
+primitive. Its operation and fixed `digest || digest` input shape provide
+structural separation from the tagged variable-length leaf hash. Adding an
+explicit internal-node tag would require an additional permutation or a
+different tree construction.
+
+### 2.3 One base-field linear check is not sufficiently sound
+
+A single random linear identity over KoalaBear has failure probability roughly
+`1/p`, only about 31 bits. Use a challenge in LeanVM's degree-5 extension field
+and evaluate the membership identity there. This gives an algebraic error scale
+near `1/p^5` before accounting for the rest of the proof system.
+
+LeanVM exposes `dot_product_be`, which computes a dot product between extension
+and base-field vectors. It is the natural precompile for this check.
+
+### 2.4 Fiat-Shamir challenge derivation must be verifier-reproducible
+
+The RS challenge cannot be sampled privately by the prover. It must be derived
+from a domain-separated transcript that already binds:
+
+- the complete parameter profile;
+- all `h_i`;
+- the Merkle root;
+- optionally a batch/slot/block identifier.
+
+The host verifier must derive the same challenge. Any precomputed check vector
+`L` must either be recomputed by the verifier and bound into the statement, or
+be derived and checked inside the guest.
+
+### 2.5 Systematic RS encoding must be stated precisely
+
+For `U = {omega^0, ..., omega^(m-1)}`, define `P_i` as the unique degree-`<k`
+polynomial satisfying:
+
+`P_i(omega^j) = b[i,j]` for `0 <= j < k`.
+
+Then set `x[i,j] = P_i(omega^j)` for all `0 <= j < m`. This makes the first
+`k` evaluations systematic without confusing coefficients and evaluations.
+
+## 3. LeanVM compatibility findings
+
+### 3.1 Fields
+
+LeanVM VM arithmetic uses:
+
+- Base field `F = KoalaBear`.
+- Prime `p = 2^31 - 2^24 + 1`.
+- Two-adicity `24`, so power-of-two RS domains up to `2^24` exist.
+- Proof extension field `EF = QuinticExtensionFieldKB`.
+
+The guest language's ordinary values and memory cells are base-field elements.
+Extension-field operations are available through the extension precompile.
+
+Recommendation: use KoalaBear for the application RS code in V1. Using another
+RS field would require non-native arithmetic inside the VM and would erase the
+main performance advantage.
+
+### 3.2 Hash
+
+The native permutation is Poseidon1 over KoalaBear:
+
+- width `16`;
+- rate `8`;
+- capacity `8`;
+- digest length `8` field elements.
+
+There are three relevant operations/conventions that must not be mixed:
+
+- Application slice hash: `poseidon_hash_slice`, a left-to-right overwrite
+  sponge using the raw Poseidon permutation and a length IV. The zkDSL
+  `slice_hash` implementation matches it.
+- WHIR-internal leaf hash: `hash_slice_rtl`, which absorbs in the opposite
+  direction. It is an internal proving-system convention and should not be used
+  as the PQ-DAS application hash.
+- Merkle node hash: Poseidon feed-forward compression of two 8-element digests.
+
+The repository has both host implementations and guest precompiles. Reuse their
+exact semantics rather than implementing a new Poseidon wrapper. Add cross-tests
+that compare every host hash against its zkDSL result.
+
+An 8-element digest carries about 248 raw field bits, but generic collision
+security is about 124 bits. This matches LeanVM's current stated proof target,
+not a full 128-bit target.
+
+### 3.3 Public input and witness model
+
+LeanVM has exactly 8 public base-field elements. The recommended pattern is:
+
+1. Host serializes the complete public PQ-DAS statement into field elements.
+2. Host computes its 8-element Poseidon digest and uses that as public input.
+3. The serialized statement and codeword matrix are supplied as named witness
+   blobs via `ExecutionWitness`.
+4. The guest recomputes the statement digest and asserts equality with public
+   memory.
+
+This lets `com = ({h_i}, root, pi)` remain externally readable while fitting
+LeanVM's fixed public-input surface.
+
+### 3.4 Proving-system parameters
+
+Current defaults:
+
+- degree-5 extension;
+- claimed proven security about 124 bits;
+- 16 grinding bits;
+- WHIR starting inverse rates `2^1` through `2^4`;
+- Johnson-bound mode by default;
+- optional conjectural proximity-gap feature;
+- maximum memory `2^26` field elements;
+- maximum execution rows `2^24`;
+- maximum Poseidon and extension-precompile rows `2^21`.
+
+These table limits, especially `2^21` Poseidon rows, must be enforced by the
+parameter validator before proving.
+
+## 4. Encoding and hashing profile
+
+### 4.1 External bytes to KoalaBear
+
+Define this boundary explicitly. Recommended canonical V1 encoding:
+
+- Pack 3 little-endian bytes into one KoalaBear element.
+- Record the original byte length in the statement.
+- Reject non-canonical field encodings on decode.
+
+Three-byte limbs are always below the KoalaBear modulus and avoid rejection
+sampling. Also benchmark one-byte-per-field-element encoding as a simple
+reference. Do not directly reduce 32-bit words modulo `p`, because that is not
+injective.
+
+If the upstream data already consists of canonical KoalaBear elements, bypass
+byte packing and mark the input encoding in the parameter profile.
+
+### 4.2 Cell size
+
+Start with `c=8` because it aligns with the Poseidon rate:
+
+- row systematic data is naturally hashed in 8-element chunks;
+- one row's contribution to a cell is one Poseidon rate block;
+- padding is minimized;
+- query and reconstruction accounting is simple.
+
+Benchmark `c in {4, 8, 16, 32}` later. Larger cells reduce Merkle height and the
+number of distinct queried cells needed for reconstruction, but increase query
+bandwidth and column-hash work.
+
+### 4.3 Column leaves and Merkle tree
+
+For column/cell index `j`, serialize:
+
+`tag_column || profile_id || j || n || c || cell[0,j] || ... || cell[n-1,j]`
+
+Pad to a multiple of 8 with an unambiguous length-bearing scheme, then hash with
+the LeanVM Poseidon slice hash.
+
+Build a binary tree over exactly `ell` leaves. For V1 require `ell` to be a power
+of two. Later, specify a fixed padding-leaf rule if non-power-of-two sizes are
+needed.
+
+Do not add an extra Merkle leaf hash over an already hashed `d_j` unless both
+guest and external verifier intentionally use that two-level convention.
+
+## 5. RS membership implementation
+
+### 5.1 V1 path: special barycentric check
+
+Use `rho=1/2`, `m=2k`, and the even/odd split from the design.
+
+Recommended flow:
+
+1. Derive an extension-field challenge `p_ext` from the typed public statement.
+2. Set `q_ext = p_ext / omega`.
+3. Compute or witness barycentric coefficients in the extension field.
+4. If coefficients are witnessed, check each claimed inverse with a
+   multiplication identity before use.
+5. For each row, use `dot_product_be` for the even and odd halves.
+6. Assert `A_i(p_ext) - B_i(q_ext) = 0` in all five extension coordinates.
+7. Process rows with `parallel_range`.
+
+This path has linear guest work per row and avoids an in-guest FFT.
+
+### 5.2 Alternative paths for benchmarking
+
+Implement behind a compile-time `MembershipCheck` profile:
+
+- `SpecialBarycentricHalfRate`: first demo and expected best V1 option.
+- `ParityExtension`: host computes the randomized parity vector over the
+  extension field; verifier recomputes it; guest checks its binding and performs
+  extension/base dot products.
+- `GeneralBarycentricExtension`: supports arbitrary systematic subsets/rates,
+  with higher coefficient-generation cost.
+
+Do not benchmark a single base-field check as a production candidate. It can be
+kept only as an explicitly insecure performance lower bound.
+
+### 5.3 Batch soundness across rows
+
+Checking each row with the same random dual-code vector is acceptable only after
+a written soundness argument for the batched statement. For a conservative
+first implementation, derive independent row challenges from:
+
+`tag_rs || statement_digest || row_index`.
+
+Then add a benchmark mode using a shared challenge to quantify the performance
+benefit before deciding whether to adopt it.
+
+## 6. Software architecture
+
+Add a new workspace crate, tentatively `crates/pq_das`, with:
+
+```text
+crates/pq_das/
+  src/
+    config.rs          Typed parameter profiles and validation
+    encoding.rs        Bytes <-> KoalaBear and systematic RS encoding
+    hashing.rs         Typed Poseidon transcript and domain tags
+    merkle.rs          Column commitment/open/verify
+    membership.rs      Host-side challenge/coefficient generation
+    statement.rs       Canonical public statement serialization
+    prover.rs          LeanVM witness construction and prove_execution wrapper
+    verifier.rs        Native transcript and LeanVM proof verification
+    reconstruction.rs  Cell collection and RS erasure reconstruction
+    benchmark.rs       Structured JSON benchmark reports
+  zkdsl/
+    main.py
+    hashing.py
+    membership.py
+    utils.py
+  tests/
+```
+
+Keep host and guest constants generated from one Rust `ParameterProfile`.
+Compile separate bytecode per profile using `CompilationFlags::replacements`.
+The bytecode hash is already included in LeanVM's Fiat-Shamir domain separation,
+so proof verification remains bound to the selected compiled program.
+
+## 7. Implementation workflow
+
+### Phase 0: freeze the executable specification
+
+- Resolve the symbol/cell notation and reconstruction threshold.
+- Freeze canonical serialization and all domain tags.
+- Decide whether RS row challenges are independent or shared.
+- Specify malformed-input behavior and exact bounds.
+- Add test vectors for byte encoding, Poseidon hashes, RS encoding, Merkle roots,
+  and membership coefficients.
+
+Exit criterion: Rust and a small independent reference script produce identical
+vectors.
+
+### Phase 1: native host implementation
+
+- Implement parameter validation.
+- Implement bytes-to-KoalaBear encoding.
+- Implement systematic RS encode/reconstruct.
+- Implement typed row/column hashing and Merkle open/verify.
+- Implement all three membership checks natively.
+- Add corruption tests for every row, symbol, digest, root, and path.
+
+Exit criterion: `Com`, query verification, and `Ext` work without LeanVM.
+
+### Phase 2: minimal LeanVM relation
+
+- Create a zkDSL guest that reads a small fixed-size matrix.
+- Recompute the statement digest.
+- Recompute row hashes and column hashes.
+- Recompute the Merkle root.
+- Implement only `SpecialBarycentricHalfRate`.
+- Wire `compile_program_with_flags`, `ExecutionWitness`,
+  `prove_execution`, and `verify_execution`.
+
+Start with a tiny profile such as:
+
+`n=2, m=16, k=8, c=8`.
+
+Exit criterion: one valid proof verifies and one mutation in every witness
+component fails.
+
+### Phase 3: scalable guest
+
+- Replace fixed loops with bounded runtime loops where useful.
+- Use `parallel_range` across independent rows and columns.
+- Keep lengths and maxima compile-time bounded.
+- Add preflight estimates for memory, execution rows, Poseidon rows, and
+  extension rows.
+- Increase through profiles until table limits or RAM become the bottleneck.
+
+Exit criterion: repeatable proofs for representative DAS sizes with no hidden
+out-of-range or unchecked dynamic indices.
+
+### Phase 4: complete protocol API
+
+- Expose `setup`, `commit`, `query`, `verify_query`, and `reconstruct`.
+- Define serializable `Commitment`, `AuxData`, `Transcript`, and `ProofBundle`.
+- Add deterministic query sampling from an explicitly specified randomness
+  source.
+- Support multiproofs or deduplicated Merkle paths as a later optimization.
+
+Exit criterion: an end-to-end integration test covers commit, proof, sampling,
+query verification, and reconstruction.
+
+### Phase 5: benchmark and optimize
+
+- Add JSON output with profile ID, git commit, machine data, and security mode.
+- Profile guest instruction counts and all LeanVM table heights.
+- Optimize only after separating RS, hashing, Merkle, witness generation,
+  proving, verification, proof size, and query bandwidth.
+
+## 8. Benchmark matrix
+
+Sweep one axis at a time:
+
+| Axis | Initial values |
+|---|---|
+| Rows `n` | `1, 2, 4, 8, 16, 32` |
+| RS size `m` | `2^8, 2^10, 2^12, 2^14` |
+| RS rate `rho` | `1/2`, then `1/4` |
+| Cell size `c` | `4, 8, 16, 32` |
+| Membership | special barycentric, parity, general barycentric |
+| Row challenge | independent, shared |
+| WHIR rate | `1/2, 1/4, 1/8, 1/16` |
+| Security assumption | proven default; conjectural reported separately |
+| Byte encoding | 3-byte limbs; 1-byte reference |
+
+Record:
+
+- encoding and reconstruction time;
+- native hash/Merkle time;
+- witness generation time;
+- execution cycles;
+- execution, Poseidon, and extension table rows;
+- proving time and peak RSS;
+- verification time;
+- proof size;
+- commitment size;
+- query response bytes;
+- minimum distinct cells required for reconstruction.
+
+Never compare benchmark rows without including the LeanVM commit, compiled
+bytecode hash, CPU features, thread count, and security mode.
+
+## 9. Recommended first three deliverables
+
+1. **Specification patch and native vectors**: fix the V1 document and publish
+   canonical vectors for one tiny and one medium profile.
+2. **Tiny LeanVM proof demo**: `n=2, m=16, k=8, c=8`, full relation, negative
+   tests, and JSON timing.
+3. **Parameter sweep demo**: compile-time profiles for several `n/m/c` values,
+   all four WHIR rates, and at least two membership checks.
+
+## 10. Decisions to defer
+
+Do not block the first demo on:
+
+- replacing KoalaBear or Poseidon;
+- 128-bit rather than current approximately 124-bit LeanVM security;
+- non-power-of-two RS or Merkle domains;
+- a production network sampling policy;
+- Merkle multiproofs;
+- recursive aggregation of many PQ-DAS proofs.
+
+These should remain explicit roadmap items. A different base field or hash is a
+LeanVM backend experiment, not merely a PQ-DAS runtime parameter.
diff --git a/PQ-DAS-V1/PQ-DAS Optimization Checklist.md b/PQ-DAS-V1/PQ-DAS Optimization Checklist.md
new file mode 100644
index 000000000..4925928b9
--- /dev/null
+++ b/PQ-DAS-V1/PQ-DAS Optimization Checklist.md	
@@ -0,0 +1,73 @@
+# PQ-DAS Optimization Checklist
+
+## A. Profiling and Cost Breakdown
+
+- [ ] **A1. Add detailed LeanVM prover profiling:** report actual and padded rows for every LeanVM table.
+- [ ] **A2. Add stage-level prover timing:** separately measure bytecode execution, trace generation, memory access counting, logup, AIR sumcheck, WHIR, and grinding.
+- [ ] **A3. Enable LeanVM VM profiling and tracing:** expose instruction, memory, Poseidon, and extension-operation statistics.
+- [ ] **A4. Add relation-isolation benchmarks:** benchmark row hashing, column hashing plus Merkle construction, and RS membership independently.
+
+## B. LeanVM Proof Optimization
+
+- [ ] **B1. Implement a PQ-DAS-specific LeanVM precompile/AIR:** directly constrain row hashes, the column Merkle root, and RS membership.
+- [ ] **B2. Add a batched Poseidon sponge precompile:** process all row hashes or all column hashes in batches.
+- [ ] **B3. Add a strided Poseidon input mode:** hash systematic row symbols directly without copying them into a temporary array.
+- [ ] **B4. Add a gathered Poseidon input mode:** hash column cells directly across rows without constructing `column_data`.
+- [ ] **B5. Fuse column hashing with Merkle construction:** stream leaf digests into a Merkle accumulator without storing the full tree in VM memory.
+- [ ] **B6. Add a batched matrix-vector membership precompile:** compute all $\langle w_i,L\rangle$ checks while reusing the same public $L$.
+- [ ] **B7. Add a structured fixed-column segment for $L$:** remove the $5m$ coordinates of $L$ from generic VM memory while keeping verifier-side derivation mandatory.
+- [ ] **B8. Store codewords in dedicated AIR trace columns:** avoid placing the complete $n\times m$ codeword matrix in generic VM memory.
+- [ ] **B9. Reduce generic VM memory traffic:** eliminate unnecessary intermediate arrays, reads, writes, and memory-bus lookups in the guest.
+
+## C. Parallelism and Low-Level Prover Optimization
+
+- [ ] **C1. Parallelize row-hash guest loops.**
+- [ ] **C2. Parallelize column-hash guest loops.**
+- [ ] **C3. Parallelize each independent Merkle-tree level.**
+- [ ] **C4. Parallelize RS membership checks across rows.**
+- [ ] **C5. Parallelize prover memory access-count construction:** use thread-local accumulators followed by reduction.
+- [ ] **C6. Parallelize bytecode access-count construction.**
+- [ ] **C7. Parallelize independent trace-table processing.**
+- [ ] **C8. Enable native CPU optimization:** benchmark builds using `RUSTFLAGS="-C target-cpu=native"`.
+- [ ] **C9. Verify SIMD utilization:** confirm AVX2 or AVX-512 Poseidon and packed KoalaBear arithmetic are active.
+- [ ] **C10. Tune LeanVM worker-thread count for the benchmark machine.**
+- [ ] **C11. Add NUMA-aware trace allocation and worker placement.**
+- [ ] **C12. Reuse prover scratch buffers and avoid repeated zero-initialization.**
+- [ ] **C13. Evaluate GPU acceleration for WHIR FFTs, Poseidon, Merkle commitments, and sumcheck.**
+
+## D. WHIR Configuration Optimization
+
+- [ ] **D1. Benchmark `whir_log_inv_rate` values 1 through 4:** compare prover time, verifier time, proof size, and memory.
+- [ ] **D2. Tune the initial WHIR folding factor while preserving at least 124-bit security.**
+- [ ] **D3. Tune the subsequent WHIR folding factor while preserving at least 124-bit security.**
+- [ ] **D4. Tune the RS-domain initial reduction factor.**
+- [ ] **D5. Tune the grinding-bit and query-count allocation while preserving total soundness.**
+- [ ] **D6. Tune `MAX_NUM_VARIABLES_TO_SEND_COEFFS`.**
+
+## E. Host and Engineering Optimization
+
+- [ ] **E1. Cache profile-specific compiled bytecode templates.**
+- [ ] **E2. Bind commitment-specific row hashes, root, and $L$ without recompiling the guest.**
+- [ ] **E3. Cache FFT roots, twiddle factors, domains, and reusable FFT plans.**
+- [ ] **E4. Cache reconstruction locator data when multiple rows use the same erasure pattern.**
+- [ ] **E5. Implement a zero-copy contiguous codeword representation.**
+- [ ] **E6. Remove the witness flattening copy before LeanVM execution.**
+- [ ] **E7. Parallelize native RS encoding across blobs.**
+- [ ] **E8. Parallelize native row and column commitment hashing.**
+- [ ] **E9. Parallelize reconstruction across rows.**
+- [ ] **E10. Add reusable benchmark automation and machine-configuration reporting.**
+
+## F. Protocol-Level Optimization Candidates
+
+- [ ] **F1. Evaluate row-sharded parallel proofs:** measure latency, total proof size, and verifier cost.
+- [ ] **F2. Evaluate recursive aggregation of row-sharded proofs.**
+- [ ] **F3. Benchmark alternative cell sizes $c$:** measure Merkle cost, sample size, sampling count, and network granularity.
+- [ ] **F4. Benchmark alternative RS code rates $\rho=k/m$:** recompute reconstruction and availability soundness for every rate.
+- [ ] **F5. Design a V2 commitment that avoids hashing systematic symbols independently in both row and column commitments.**
+
+## Security Constraints
+
+- [ ] **S1. Keep LeanVM and DAS sampling soundness at or above the configured security target.**
+- [ ] **S2. Keep verifier-side Fiat-Shamir challenge and $L$ reconstruction mandatory.**
+- [ ] **S3. Keep row hashes, column commitment root, and RS membership inside the proved relation.**
+- [ ] **S4. Do not reduce the extension degree, digest length, or proof-system security solely for performance.**
diff --git a/PQ-DAS-V1/PQ-DAS V1 Design.md b/PQ-DAS-V1/PQ-DAS V1 Design.md
new file mode 100644
index 000000000..e8d2c2c57
--- /dev/null
+++ b/PQ-DAS-V1/PQ-DAS V1 Design.md	
@@ -0,0 +1,143 @@
+---
+title: PQ-DAS V1 Design
+
+---
+
+# PQ-DAS V1 Design 
+
+## Purpose
+
+This document collects the first version of the PQ-DAS Encoding + Prove design, creates the spreadsheet with flexible variables in the design, and records the local implementation benchmarks in the spreadsheet.
+
+## DAS Construction
+![V1 Design Diagram - 2](https://hackmd.io/_uploads/ryqsi9rxGe.png)
+
+- The Setup algorithm $\mathsf{Setup}(1^{\lambda}) \rightarrow {\sf pp}$: 
+    1. Choose a hash function $\mathsf{H}: \{0, 1\}^* \rightarrow \{0, 1\}^{\lambda}$.
+    2. Define the Reed-Solomon (RS) code ${\sf RS}[\mathbb{F}, {\sf U}, \rho]$ and its corresponding encoding algorithm $\mathcal{C}: \mathbb{F}^k \rightarrow \mathbb{F}^m$, where $\mathbb{F}$ is a finite field, ${\sf U}$ is the evaluation domain, $\rho$ is the code rate, $k$ is the length of the input vector, $m = |{\sf U}|$ is the length of the output vector, which satisfies $k = \rho m$.
+    3. Define the number of field elements $c$ in a cell. 
+    4. Define the reconstruction threshold $t = \left\lceil \frac{k}{c} \right\rceil$ in cells.
+    5. Define the public parameters used in LeanVM as $\mathsf{pp}_{\sf STARK}$.
+    6. Output $\mathsf{pp} = (\mathsf{H}, \mathbb{F}, {\sf U}, m, k, \rho, c, t, \mathsf{pp}_{\sf STARK})$.
+    
+- The encoding algorithm $\mathsf{Com}({\sf pp}, {\sf data}) \rightarrow ({\sf com}, {\sf \tau})$:
+    1. Parse ${\sf data}$ into a set of blobs ${\sf data} = (b_1, ..., b_n)$, each blob has $k$ symbols.
+    2. RS Encode each blob such that each codeword has $m$ symbols, i.e., $\forall i \in [1, n]$: $\mathcal{C}(b_i) = w_i = (w_{i,1},\ldots,w_{i,m}) \in \mathbb{F}^m$. For each codeword, the first $k$ symbols are the systematic data, i.e., $\forall i \in [1, n], s \in [1, k]: w_{i, s} = b_{i, s}$.
+    3. Form a matrix such that each row is a codeword $w_i$. We group every $c$ consecutive field elements in each $w_i$ as a cell, so there are in total $\ell = \frac{m}{c}$ cells on each row. We use $W_{i,j} = (w_{i,(j-1)c+1},\ldots,w_{i,jc}) \in \mathbb{F}^c$ to denote the $j$-th cell on the $i$-th row in the matrix.
+    4. Hash the systempatic part of each codeword, i.e., $\forall i \in [1, n]$: $h_i = \mathsf{H}(w_{i, 1} \parallel \cdots \parallel w_{i, k})$.
+    5. Hash each column of cells in the matrix, i.e., $\forall j \in [1, \ell]$: $d_j = \mathsf{H}(W_{1, j} \parallel \cdots \parallel W_{n, j})$.
+    6. Generate a Merkle tree for all column digests $d_j$, i.e., ${\sf root} = \mathsf{Merkle.Com}(d_1 \parallel \cdots \parallel d_{\ell})$.
+    7. Use LeanVM to generate a STARK proof $\pi \leftarrow {\sf LeanVM}.{\sf Prove}({\sf pp}_{\sf STARK}, {\sf stmt}, {\sf witn}, \mathcal{R})$, where
+    \begin{aligned}
+    \mathcal{R}
+    =
+    \{(\mathsf{stmt},\mathsf{witn}) \;:\;&
+    \mathsf{stmt}=(\{h_i\}_{i\in[1,n]},\mathsf{root}),
+    \ \mathrm{witn}= \{w_i\}_{i \in [1, n]},\\
+    &
+    \forall i\in[1,n],\;
+    h_i = \mathsf{H}(w_{i, 1} \parallel \cdots \parallel w_{i, k}),\\
+    &
+    \forall j\in[1, \ell],\;
+    d_j=\mathsf{H}(
+    W_{1,j} \parallel \cdots\parallel W_{n,j}
+    ),\\
+    &
+    \mathrm{root}
+    =
+    \mathsf{Merkle.Com}(d_1\parallel\cdots\parallel d_{\ell}),\\
+    &
+    \forall i\in[1,n],\;
+    w_i\in \mathrm{RS}[\mathbb{F},\mathrm{U},\rho]
+    \}.
+    \end{aligned}
+    8. Open the Merkle authentication paths for all column digests i.e., $\{{\sf auth}_j\}_{j \in [1, \ell]} = {\sf Merkle.Open}(d_1, ..., d_{\ell}, {\sf root})$. 
+    9. Output ${\sf com} = (\{h_i\}_{i \in [1, n]}, {\sf root}, \pi)$, ${\sf \tau} = \left (\{w_i\}_{i \in [1, n]}, \{{\sf auth}_j\}_{j \in [1, \ell]}\right)$.
+
+- The query algorithm ${\sf V}^{\pi, Q}_1({\sf com}) \rightarrow {\sf tran}$: 
+    1. Generate the query index set $Q \leftarrow {\sf Sample}(1^{\lambda})$. 
+    2. Set the transcript ${\sf tran} = (Q, \{W_{1, j}, ..., W_{n, j}, {\sf auth}_j\}_{j \in Q})$.
+
+- The verification algorithm ${\sf V}_2({\sf com}, {\sf tran}) \rightarrow b$:
+    1. Verify the STARK proof: Check if ${\sf LeanVM}.{\sf Verify}({{\sf pp}_{\sf STARK}, \sf stmt}, \pi) = 1$.
+    2. Verify the openings: Compute $\forall j \in Q: d_j = \mathsf{H}(W_{1, j} \parallel \cdots \parallel W_{n, j})$, check if ${\sf Merkle}.{\sf Verify}({\sf root}, \{d_j, {\sf auth}_j\}_{j \in Q}) = 1$.
+    3. If all checks passes, output $b = 1$. Otherwise, output $0$.
+
+- The reconstruction algorithm ${\sf Ext}({\sf com}, {\sf tran}_1, ..., {\sf tran}_L) \rightarrow {\sf data}/\bot$:
+    1. For $i \in [1, L]$: Parse ${\sf tran}_i = (Q_i, \{W_{1, j}, ..., W_{n, j}, {\sf auth}_j\}_{j \in Q_i})$.
+    2. Check if $\forall i \in [1, L]$: ${\sf V}_2({\sf com}, {\sf tran}_i) = 1$. Otherwise return $\bot$.
+    3. Find the union set $I$ for all query index sets, i.e., $I = Q_1 \cup Q_2 \cdots \cup Q_L$.
+    4. Check if set $I$ has size over the threshold, i.e., $|I| \geq t$. If not return $\bot$.
+    5. Reconstruct the data from the codeword symbols contained in the cells indexed by $I$, i.e., ${\sf data} = {\sf Reconst}\left(\{W_{1, j}, ..., W_{n, j}\}_{j \in I}\right)$.
+---
+
+## RS Membership Check Instantiations
+
+The condition $w_i\in \mathrm{RS}[\mathbb{F},{\sf U},\rho]$ in the STARK relation can be instantiated by one of the following linear checks.
+
+### 1. Parity-check: $\deg(P_i)<k \Rightarrow c_{i,k}=\cdots=c_{i,m-1}=0$
+
+#### Preprocessing outside the proof:
+
+* Let ${\sf U}=\{\omega^0,\omega^1,\ldots,\omega^{m-1}\}$, where $\omega$ is a primitive $m$-th root of unity.
+* Let $i$ denote the row index, $j$ denote the codeword-symbol index on each row, and $r$ denote the coefficient index of the interpolated polynomial.
+* For each row $w_i$, let $P_i(X)=\sum_{r=0}^{m-1}c_{i,r}X^r$ be the polynomial interpolated from its field symbols, where $\forall j \in[0, m-1]: P_i(\omega^j) = w_{i,j}$.
+* The coefficients are given by $c_{i,r}=\frac{1}{m}\sum_{j=0}^{m-1} w_{i,j}\omega^{-jr}$.
+* Sample $\{\alpha_r\}_{r \in [k,m-1]} \leftarrow \mathsf{H}({\sf pp}, \{h_i\}_{i \in [1, n]}, {\sf root})$.
+* Compute the shared parity-check vector $L=(L_0,\ldots,L_{m-1})$, where $\forall j\in[0,m-1]: L_j=\frac{1}{m}\sum_{r=k}^{m-1}\alpha_r\omega^{-jr}$.
+
+#### Inner product inside the proof: 
+$$ \begin{aligned} \forall i\in[1,n]:\quad \langle L,w_i\rangle &= \sum_{j=0}^{m-1}L_jw_{i,j} = \sum_{j=0}^{m-1}\left(\frac{1}{m}\sum_{r=k}^{m-1}\alpha_r\omega^{-jr}\right)w_{i,j} \\ &= \sum_{r=k}^{m-1}\alpha_r\left(\frac{1}{m}\sum_{j=0}^{m-1}w_{i,j}\omega^{-jr}\right) = \sum_{r=k}^{m-1}\alpha_rc_{i,r} = 0. \end{aligned} $$
+
+### 2. General barycentric check
+
+#### Preprocessing outside the proof:
+* Let ${\sf U} = \{u_0,u_1,\ldots,u_{m-1}\}$.
+* Let $i$ denote the row index, $j$ denote the codeword-symbol index on each row, and $s,t$ denote the interpolation and check positions.
+* Choose $S\subseteq[0,m-1]$ with $|S|=k$, and let $T=[0,m-1]\setminus S$.
+* For each $s\in S$, define the Lagrange basis polynomial $\ell_s(X)$ over ${u_s:s\in S}$, where $\ell_s(u_{s'})=1$ if $s=s'$ and $\ell_s(u_{s'})=0$ otherwise.
+* For each row $w_i$, define $P_i(X)=\sum_{s\in S}\ell_s(X)w_{i,s}$.
+* Sample $\{\alpha_t\}_{t\in T}\leftarrow\mathsf{H}({\sf pp},\{h_i\}_{i \in [1, n]}, {\sf root})$.
+* Compute the shared barycentric-check vector $L=(L_0,\ldots,L_{m-1})$, where $\forall t\in T:L_t=\alpha_t$ and $\forall s\in S:L_s=-\sum_{t\in T}\alpha_t\ell_s(u_t)$.
+
+#### Inner product inside the proof: 
+$$ \begin{aligned} \forall i\in[1,n]:\quad \langle L,w_i\rangle &= \sum_{j=0}^{m-1}L_jw_{i,j} = \sum_{t\in T}L_tw_{i,t}+\sum_{s\in S}L_sw_{i,s} \\ &= \sum_{t\in T}\alpha_tw_{i,t} -\sum_{s\in S}\left(\sum_{t\in T}\alpha_t\ell_s(u_t)\right)w_{i,s} \\ &= \sum_{t\in T}\alpha_t\left(w_{i,t}-\sum_{s\in S}\ell_s(u_t)w_{i,s}\right) = \sum_{t\in T}\alpha_t\left(w_{i,t}-P_i(u_t)\right) = 0. \end{aligned} $$
+
+### 3. Special barycentric check $(\rho = 1/2)$:
+
+#### Preprocessing outside the proof:
+
+* Let ${\sf U}=\{\omega^0,\omega^1,\ldots,\omega^{m-1}\}$, where $\omega$ is a primitive $m$-th root of unity, and assume $m=2k=2h$.
+* Let $i$ denote the row index, $j$ denote the codeword-symbol index on each row, and $r$ denote the index on the half-size domain.
+* Define $x_r=(\omega^2)^r$ for $r\in[0,h-1]$.
+* For each row $w_i$, define $A_i(x_r)=w_{i,2r}$ and $B_i(x_r)=w_{i,2r+1}$.
+* Sample $p \leftarrow\mathsf{H}({\sf pp},\{h_i\}_{i \in [1, n]}, {\sf root})$ and set $q=p/\omega$.
+* Define $\ell_r(z)=\frac{z^h-1}{h}\cdot\frac{x_r}{z-x_r}$.
+* Compute the shared barycentric-check vector $L=(L_0,\ldots,L_{m-1})$, where $\forall r\in[0,h-1]:L_{2r}=\ell_r(p)$ and $L_{2r+1}=-\ell_r(q)$.
+
+#### Inner product inside the proof: 
+$$ \begin{aligned} \forall i\in[1,n]:\quad \langle L,w_i\rangle &= \sum_{j=0}^{m-1}L_jw_{i,j} = \sum_{r=0}^{h-1}L_{2r}w_{i,2r} +\sum_{r=0}^{h-1}L_{2r+1}w_{i,2r+1} \\ &= \sum_{r=0}^{h-1}\ell_r(p)w_{i,2r} -\sum_{r=0}^{h-1}\ell_r(q)w_{i,2r+1} = A_i(p)-B_i(q) \\ &= A_i(p)-B_i(p/\omega) = 0. \end{aligned} $$
+
+---
+
+## Parameter Spreadsheet
+
+The following table summarizes the flexible parameters in the current PQ-DAS V1 construction.
+
+| Parameter | Definition |
+|---|---|
+| $\lambda$ | Security parameter. |
+| $\mathsf{H}$ | Hash function and hash-to-field function. |
+| $\mathbb{F}$ | Finite field used for RS encoding and STARK arithmetic. |
+| ${\sf U}$ | RS evaluation domain. |
+| $m$ | Number of RS symbols in each encoded codeword. |
+| $k$ | Number of input symbols in each blob, satisfying $k=\rho m$. |
+| $\rho$ | RS code rate. |
+| $n$ | Number of blobs, equivalently the number of rows in the matrix. |
+| $c$ | Number of field elements grouped into one cell. |
+| $\ell$ | Number of cells on each row, where $\ell=m/c$. |
+| $t = \left\lceil \frac{k}{c} \right\rceil$ | Reconstruction threshold measured in cells. |
+| $\lvert Q\rvert$ | Number of queried columns/cells in one transcript. |
+| $L$ | Number of accepted transcripts used for reconstruction. |
+| Membership check type | Choice of RS membership check: parity-check, general barycentric check, or special barycentric check for $\rho=1/2$. |
+| $\mathsf{pp}_{\sf STARK}$ | Public parameters used by LeanVM. |
diff --git a/crates/lean_prover/src/prove_execution.rs b/crates/lean_prover/src/prove_execution.rs
index dc46d2df6..1aeeb05a4 100644
--- a/crates/lean_prover/src/prove_execution.rs
+++ b/crates/lean_prover/src/prove_execution.rs
@@ -225,8 +225,9 @@ pub fn prove_execution(
         committed_statements.get_mut(table).unwrap().push(claim);
     }
 
-    let public_memory_random_point = MultilinearPoint(prover_state.sample_vec(log2_strict_usize(PUBLIC_INPUT_LEN)));
-    let public_memory_eval = (&memory[..PUBLIC_INPUT_LEN]).evaluate(&public_memory_random_point);
+    let public_memory_len = bytecode.public_memory_len();
+    let public_memory_random_point = MultilinearPoint(prover_state.sample_vec(log2_strict_usize(public_memory_len)));
+    let public_memory_eval = (&memory[..public_memory_len]).evaluate(&public_memory_random_point);
 
     let previous_statements = vec![
         SparseStatement::new(
diff --git a/crates/lean_prover/src/verify_execution.rs b/crates/lean_prover/src/verify_execution.rs
index 3145edd36..870958f61 100644
--- a/crates/lean_prover/src/verify_execution.rs
+++ b/crates/lean_prover/src/verify_execution.rs
@@ -170,8 +170,10 @@ pub fn verify_execution(
         return Err(ProofError::InvalidProof);
     }
 
-    let public_memory_random_point = MultilinearPoint(verifier_state.sample_vec(log2_strict_usize(public_input.len())));
-    let public_memory_eval = public_input.evaluate(&public_memory_random_point);
+    let public_memory = bytecode.public_memory(public_input);
+    let public_memory_random_point =
+        MultilinearPoint(verifier_state.sample_vec(log2_strict_usize(public_memory.len())));
+    let public_memory_eval = public_memory.evaluate(&public_memory_random_point);
 
     let previous_statements = vec![
         SparseStatement::new(
diff --git a/crates/lean_vm/src/diagnostics/exec_result.rs b/crates/lean_vm/src/diagnostics/exec_result.rs
index 82ffed943..f1475d861 100644
--- a/crates/lean_vm/src/diagnostics/exec_result.rs
+++ b/crates/lean_vm/src/diagnostics/exec_result.rs
@@ -12,7 +12,7 @@ pub struct ExecutionMetadata {
     pub n_poseidons: usize,
     pub n_extension_ops: usize,
     pub bytecode_size: usize,
-    pub public_input_size: usize,
+    pub public_memory_size: usize,
     pub runtime_memory: usize,
     pub memory_usage_percent: f64,
     pub stdout: String,
@@ -44,8 +44,8 @@ impl ExecutionMetadata {
         out.push('\n');
         out.push_str(&format!("Bytecode size: {}\n", pretty_integer(self.bytecode_size)));
         out.push_str(&format!(
-            "Public input size: {}\n",
-            pretty_integer(self.public_input_size)
+            "Public memory size: {}\n",
+            pretty_integer(self.public_memory_size)
         ));
         out.push_str(&format!("Runtime memory: {}\n", pretty_integer(self.runtime_memory)));
         out.push_str(&format!("Memory usage: {:.1}%\n", self.memory_usage_percent));
diff --git a/crates/lean_vm/src/execution/runner.rs b/crates/lean_vm/src/execution/runner.rs
index 5e8e8f42c..e996dfef1 100644
--- a/crates/lean_vm/src/execution/runner.rs
+++ b/crates/lean_vm/src/execution/runner.rs
@@ -278,9 +278,12 @@ fn execute_bytecode_helper(
     let hint_data = &witness.hints;
     let mut hint_indices = vec![0usize; n_slots];
 
-    let public_memory = public_input.to_vec();
+    // Read-only bytecode data is part of public memory, so the proof binds every
+    // constant without spending VM instructions copying it into runtime memory.
+    let public_memory = bytecode.public_memory(public_input);
+    let public_memory_len = public_memory.len();
     let mut memory = Memory::new(public_memory);
-    let mut fp = PUBLIC_INPUT_LEN + witness.preamble_memory_len;
+    let mut fp = public_memory_len + witness.preamble_memory_len;
     fp = fp.next_multiple_of(DIMENSION);
     let initial_ap = fp + bytecode.starting_frame_memory();
     let mut pc = STARTING_PC;
@@ -360,7 +363,7 @@ fn execute_bytecode_helper(
     } else {
         None
     };
-    let runtime_memory_size = memory.0.len() - PUBLIC_INPUT_LEN - witness.preamble_memory_len;
+    let runtime_memory_size = memory.0.len() - public_memory_len - witness.preamble_memory_len;
     let used_memory_cells = parallel::map_reduce(
         memory.0.len(),
         || 0usize,
@@ -373,7 +376,7 @@ fn execute_bytecode_helper(
         n_poseidons: trace.tables[&Table::poseidon16()].columns[0].len(),
         n_extension_ops: trace.tables[&Table::extension_op()].columns[0].len(),
         bytecode_size: bytecode.size(),
-        public_input_size: PUBLIC_INPUT_LEN,
+        public_memory_size: public_memory_len,
         runtime_memory: runtime_memory_size,
         memory_usage_percent: used_memory_cells as f64 / memory.0.len() as f64 * 100.0,
         stdout: std::mem::take(std_out),
diff --git a/crates/lean_vm/src/isa/bytecode.rs b/crates/lean_vm/src/isa/bytecode.rs
index 5e734a6b4..5a6547a43 100644
--- a/crates/lean_vm/src/isa/bytecode.rs
+++ b/crates/lean_vm/src/isa/bytecode.rs
@@ -2,7 +2,7 @@
 
 use backend::*;
 
-use crate::{DIMENSION, F, FileId, FunctionName, Hint, N_INSTRUCTION_COLUMNS, SourceLocation};
+use crate::{DIMENSION, F, FileId, FunctionName, Hint, N_INSTRUCTION_COLUMNS, PUBLIC_INPUT_LEN, SourceLocation};
 
 use super::Instruction;
 use super::encoder::field_representation;
@@ -31,6 +31,8 @@ pub struct Bytecode {
     code: Vec<CodeEntry>,
     hint_name_to_index: BTreeMap<String, usize>,
     instructions_multilinear: Vec<F>,
+    /// Public read-only values loaded after the fixed public input before execution.
+    read_only_data: Vec<F>,
     starting_frame_memory: usize,
     ending_pc: usize, // always `code.len() - 1`
     hash: [F; DIGEST_ELEMS],
@@ -67,6 +69,7 @@ impl Bytecode {
             code,
             hint_name_to_index,
             instructions_multilinear,
+            read_only_data: Vec::new(),
             starting_frame_memory,
             ending_pc,
             hash,
@@ -90,6 +93,49 @@ impl Bytecode {
         &self.instructions_multilinear
     }
 
+    /// Returns the absolute memory address at which the read-only segment begins.
+    pub const fn read_only_data_start(&self) -> usize {
+        PUBLIC_INPUT_LEN
+    }
+
+    /// Returns the public read-only values bound to this bytecode.
+    pub fn read_only_data(&self) -> &[F] {
+        &self.read_only_data
+    }
+
+    /// Attaches public read-only data and binds it into the bytecode hash.
+    pub fn with_read_only_data(mut self, mut data: Vec<F>) -> Self {
+        data.resize(data.len().next_multiple_of(DIGEST_ELEMS), F::ZERO);
+        self.read_only_data = data;
+        self.refresh_hash();
+        self
+    }
+
+    /// Constructs the power-of-two public memory prefix checked by the proof.
+    pub fn public_memory(&self, public_input: &[F; PUBLIC_INPUT_LEN]) -> Vec<F> {
+        let mut memory = Vec::with_capacity(self.public_memory_len());
+        memory.extend_from_slice(public_input);
+        memory.extend_from_slice(&self.read_only_data);
+        memory.resize(self.public_memory_len(), F::ZERO);
+        memory
+    }
+
+    /// Returns the padded public-memory length without allocating the segment.
+    pub fn public_memory_len(&self) -> usize {
+        (PUBLIC_INPUT_LEN + self.read_only_data.len()).next_power_of_two()
+    }
+
+    /// Recomputes the Fiat-Shamir bytecode hash after read-only data changes.
+    fn refresh_hash(&mut self) {
+        let instruction_hash = poseidon_hash_slice(&self.instructions_multilinear);
+        self.hash = if self.read_only_data.is_empty() {
+            instruction_hash
+        } else {
+            let data_hash = poseidon_hash_slice(&self.read_only_data);
+            poseidon16_compress_pair(&instruction_hash, &data_hash)
+        };
+    }
+
     #[inline]
     pub fn starting_frame_memory(&self) -> usize {
         self.starting_frame_memory
diff --git a/crates/pq_das/Cargo.toml b/crates/pq_das/Cargo.toml
new file mode 100644
index 000000000..95274257e
--- /dev/null
+++ b/crates/pq_das/Cargo.toml
@@ -0,0 +1,14 @@
+[package]
+name = "pq_das"
+version.workspace = true
+edition.workspace = true
+
+[lints]
+workspace = true
+
+[dependencies]
+backend.workspace = true
+clap.workspace = true
+lean_compiler.workspace = true
+lean_prover.workspace = true
+lean_vm.workspace = true
diff --git a/crates/pq_das/README.md b/crates/pq_das/README.md
new file mode 100644
index 000000000..30b9252a6
--- /dev/null
+++ b/crates/pq_das/README.md
@@ -0,0 +1,251 @@
+# PQ-DAS V1 LeanVM Demo
+
+## Construction Algorithms and Code Mapping
+
+- **Setup**
+  - There is no standalone `setup()` function because this demo uses transparent LeanVM/WHIR parameters and compile-time profile constants.
+  - `ParameterProfile::{TINY, MEDIUM, LARGE, STRESS, BLOB_128K_1, BLOB_128K_4, BLOB_128K_16}` defines $n$, $m$, $k$, $c$, and the WHIR inverse-rate exponent.
+  - `ParameterProfile::validate()` checks the current implementation constraints: $m=2k$, power-of-two FFT and Merkle domains, $c\mid m$, KoalaBear two-adicity, and Poseidon rate alignment.
+  - `ParameterProfile::profile_block()` and `fs_block()` provide the public domain-separated field encodings used by Fiat-Shamir.
+  - `compilation_flags()` specializes the compact zkDSL program to the selected profile, while `default_whir_config()` selects the LeanVM proving parameters.
+
+- **Encoding / Commitment algorithm $\mathsf{Com}({\sf pp},{\sf data})$**
+  - `commit()` is the complete convenience entry point.
+  - `commit()` calls `encode_and_commit()` to construct the codewords and public commitment.
+    - `encode_and_commit()` calls `encode()`.
+    - `encode()` calls `encode_blob()` once for every input blob.
+    - `encode_blob()` calls `ifft()` on the size-$k$ systematic subgroup, zero-pads the coefficients to length $m$, and calls `fft()` on the size-$m$ domain.
+    - `encode_and_commit()` calls `row_hash()` for every row, `column_hash()` for every cell column, and `merkle_layers()` to construct the binary Poseidon tree and ${\sf root}$.
+  - `commit()` then calls `prepare_statement()`.
+    - `prepare_statement()` calls `check_vector()`.
+    - `check_vector()` calls `challenge()`, which calls `fiat_shamir_digest()`.
+    - `prepare_statement()` compiles the profile-specific guest once and attaches the public row hashes, ${\sf root}$, and $L$ vector as read-only LeanVM data.
+  - `commit()` finally calls `prove_codewords()`, which packages only the codeword matrix as the private witness and invokes LeanVM `prove_execution()`.
+
+- **First verifier / query algorithm $\mathsf{V}_1({\sf com})$**
+  - `ParameterProfile::sampling_count()` computes the minimum distinct-cell sample count matching LeanVM's current $124$-bit WHIR security target.
+  - `sample_query_indices()` expands external randomness and the commitment ${\sf root}$ through Poseidon, uses rejection sampling to avoid modulo bias, and derives that many distinct cell-column indices.
+  - `query()` receives one or more selected indices, extracts $W_{1,j},\ldots,W_{n,j}$ from the codeword matrix, and obtains the sibling digest at every Merkle level from the stored `merkle_layers`.
+  - `query()` returns a `Transcript` containing each queried column of cells and its authentication path.
+
+- **Second verifier algorithm $\mathsf{V}_2({\sf com},{\sf tran})$**
+  - `verify()` is the combined entry point.
+  - `verify()` first calls `verify_execution_proof()` with only the public `Commitment` and proof.
+    - `verify_execution_proof()` calls `prepare_statement()` independently on the verifier side.
+    - `prepare_statement()` recomputes Fiat–Shamir, $p$, $q$, every $L_j$, and the read-only public-data segment from the public profile, row hashes, and ${\sf root}$.
+    - It then invokes LeanVM `verify_execution()` with the independently reconstructed bytecode.
+  - `verify()` then calls `verify_openings()`.
+    - `verify_openings()` recomputes each queried `column_hash`.
+    - It folds the authentication path with `poseidon16_compress_pair()`.
+    - It accepts only if every reconstructed root equals the public commitment ${\sf root}$.
+
+- **Reconstruction algorithm $\mathsf{Ext}({\sf com},{\sf tran}_1,\ldots,{\sf tran}_L)$**
+  - `reconstruct()` calls `verify_openings()` on every supplied transcript.
+  - It unions openings by cell index and requires at least $\lceil k/c\rceil$ distinct cells.
+  - It expands the accepted cells into arbitrary indexed RS evaluations and constructs one shared `ErasureDecoder` for their erasure pattern.
+  - `ErasureDecoder::new()` constructs the erasure locator polynomial $Z(X)$ with an FFT subproduct tree, evaluates $Z$ over the size-$m$ domain, and prepares the reversed-polynomial inverse used for exact division.
+  - For each row, `ErasureDecoder::reconstruct_blob()` forms the complete evaluation vector of $N(X)=f(X)Z(X)$, applies a size-$m$ IFFT, divides $N(X)$ by $Z(X)$ using the prepared Newton inverse, and applies a size-$k$ FFT to recover the systematic even-position values.
+  - The locator preparation is shared by every row. It costs $O(M(m)\log m)$ for arbitrary erasures, where $M(m)$ is polynomial-multiplication cost; with the current radix-2 FFT multiplier this is $O(m\log^2m)$. Each row then costs $O(m\log m)$.
+
+## Current Demo Instantiation
+
+- **Blob definition:** `Blob = Vec<F>`, where `F` is the KoalaBear base-field type. One blob is exactly $k$ canonical KoalaBear elements, and `Data = Vec<Blob>` contains $n$ blobs. The 128 KiB profiles use $k=32768$, which occupies 128 KiB when each canonical field element is serialized in four bytes. This is a field-native format, not yet an injective packing of an arbitrary 128 KiB byte string.
+
+- **Fields used by the demo:**
+  - **KoalaBear base field $\mathbb{F}$:** represented by `lean_vm::F`. Blob symbols, RS coefficients and evaluations, FFT arithmetic, Poseidon inputs and digests, Merkle nodes, VM memory, and the five coordinates used to serialize extension elements all live in $\mathbb{F}$.
+  - **Quintic extension field $\mathbb{E}=\mathbb{F}[X]/(f(X))$:** represented by `lean_vm::EF`. It is used for the Fiat–Shamir challenge $p$, the derived point $q=p/\omega$, the coefficients $L_j$, and the RS membership inner product. Since $[\mathbb{E}:\mathbb{F}]=5$, one extension element is represented by five KoalaBear coordinates.
+  - **WHIR/STARK arithmetic:** LeanVM's execution proof is ultimately arithmetized over the same KoalaBear base field, while extension-field values used by the proof system and `dot_product_be()` are represented through their five base-field coordinates. The demo does not introduce a third application-level RS field.
+
+- **RS encoding and systematic data:** the evaluation domain is
+  $\mathsf{U}=\{1,\omega,\ldots,\omega^{m-1}\}$ over KoalaBear. The current special-barycentric implementation requires $m=2k$. `encode_blob()` interprets the input blob as evaluations on the size-$k$ subgroup generated by $\omega^2$, applies a size-$k$ IFFT, zero-pads the coefficient vector, and applies a size-$m$ FFT. Therefore the original blob values appear at the even codeword positions:
+  $$
+  w_{i\cdot(m/k)}=w_{2i}={\sf blob}_i.
+  $$
+
+- **Cell definition:** one row is divided into $\ell=m/c$ cells. Cell $W_{i,j}$ contains the $c$ consecutive values
+  $$
+  W_{i,j}=(w_{i,jc},\ldots,w_{i,(j+1)c-1}).
+  $$
+  A queried cell column contains the cells with the same index from all $n$ rows. The reconstruction threshold is $t=\lceil k/c\rceil$ distinct cell columns because they contain at least $k$ codeword evaluations per row.
+
+- **Arbitrary-cell reconstruction:** reconstruction accepts any verified set of at least $t$ distinct cells; the selected cells do not need to contain the systematic even positions or form an FFT subgroup. Let $S\subseteq[0,m-1]$ be the available symbol indices and let
+  $$
+  Z(X)=\prod_{j\notin S}(X-\omega^j)
+  $$
+  be the erasure locator. For each row polynomial $f(X)$, the decoder constructs the complete size-$m$ evaluation vector of
+  $$
+  N(X)=f(X)Z(X).
+  $$
+  At every available position $j\in S$, it sets $N(\omega^j)=w_jZ(\omega^j)$; at every erased position, $Z(\omega^j)=0$, so $N(\omega^j)=0$ is known without knowing $w_j$. A size-$m$ IFFT recovers the coefficients of $N$, fast exact division by $Z$ recovers the coefficients of $f$, and a size-$k$ FFT over the subgroup generated by $\omega^2$ returns
+  $$
+  \bigl(f(1),f(\omega^2),\ldots,f(\omega^{2(k-1)})\bigr),
+  $$
+  which is exactly the original systematic blob. `ErasureDecoder::new()` prepares $Z$, its domain evaluations, and the Newton division inverse once for the shared cell pattern; `ErasureDecoder::reconstruct_blob()` reuses them for every row.
+
+- **Row hash:** `row_hash()` gathers the $k$ systematic positions $w_{i,2r}$ and computes
+  $$
+  h_i=\mathsf{PoseidonHash}(w_{i,0},w_{i,2},\ldots,w_{i,2k-2}).
+  $$
+  The output $h_i$ is an eight-KoalaBear-element digest.
+
+- **Column hash and Merkle tree:** `column_hash()` computes
+  $$
+  d_j=\mathsf{PoseidonHash}(W_{1,j}\parallel\cdots\parallel W_{n,j})
+  $$
+  over $n\cdot c$ KoalaBear elements using the Poseidon width-16, rate-8 sponge. These $\ell$ digests are the leaves of a complete binary tree; every internal node is `poseidon16_compress_pair(left, right)`, and the final digest is the public ${\sf root}$.
+
+- **LeanVM public statement and witness:** on the prover side, `PreparedStatement` contains the public `Commitment`, the derived RS check vector $L$, and the compiled bytecode. The prover sends the public `Commitment` and proof, but does not send $L$ as a trusted verifier input. The verifier independently reconstructs an equivalent `PreparedStatement`. In both instances, the row hashes, ${\sf root}$, and derived $L$ are stored in a read-only public-data segment bound by both the bytecode hash and the public-memory polynomial. The private `ExecutionWitness` contains only the flattened $n\times m$ codeword matrix.
+
+- **LeanVM proof relation:** the zkDSL `main()` proves exactly three classes of constraints:
+  1. for every row, recompute the systematic Poseidon hash and equate it to the public $h_i$;
+  2. recompute every column hash and the complete binary Poseidon tree, then equate the result to the public ${\sf root}$;
+  3. for every row, execute `dot_product_be(w_i, L)` and constrain all five quintic-extension coordinates of the result to zero:
+     $$
+     \langle w_i,L\rangle=\sum_{j=0}^{m-1}w_{i,j}L_j=0\in\mathbb{E}.
+     $$
+
+- **Fiat–Shamir and $L$ computation outside the proof:** `fiat_shamir_digest()` hashes the RS domain separator, encoded profile, all row hashes, and the Merkle ${\sf root}$. The first five digest coordinates define $p\in\mathbb{E}$, and $q=p/\omega$. For $x_r=(\omega^2)^r$, the implementation computes
+  $$
+  L_{2r}=\frac{p^k-1}{k}\cdot\frac{x_r}{p-x_r},
+  \qquad
+  L_{2r+1}=-\frac{q^k-1}{k}\cdot\frac{x_r}{q-x_r}.
+  $$
+  Montgomery batch inversion replaces the $2k$ individual extension-field inversions with one inversion and $O(k)$ multiplications. For nonzero denominators $a_1,\ldots,a_N$, it computes prefix products $P_i=\prod_{j=1}^{i}a_j$, inverts only $P_N$, and walks backward using
+  $$
+  a_i^{-1}=P_{i-1}\cdot P_N^{-1}\cdot\prod_{j=i+1}^{N}a_j,
+  $$
+  updating the suffix inverse at each step. Thus every $a_i^{-1}$ is recovered with multiplications after one inversion. The resulting $L$ is generated once during each party's statement preparation and reused throughout that party's proof or verification workflow.
+  The verifier is required to call `prepare_statement(commitment)` independently. Consequently, it recomputes the Fiat–Shamir digest, $p$, $q$, all denominator inverses, and every $L_j$ from the public profile, row hashes, and ${\sf root}$ before LeanVM proof verification. A prover-supplied incorrect challenge or $L$ therefore produces bytecode that differs from the verifier's bytecode and cannot verify.
+
+- **RS membership computation inside the proof:** for each row, LeanVM's extension-field precompile computes
+  $$
+  \sum_{j=0}^{m-1}w_{i,j}L_j.
+  $$
+  Since $L_j$ has five KoalaBear coordinates, `dot_product_be()` returns one quintic-extension element. `assert_ext_zero()` constrains all five coordinates to zero. No Fiat–Shamir hashing, challenge derivation, denominator inversion, or $L$ construction occurs inside the proof.
+
+## Benchmark Parameter Sets
+
+| Dataset | Base field $\mathbb{F}$ | Challenge field $\mathbb{E}$ | Hash | $\rho$ | Membership | WHIR $\log(1/{\rm rate})$ | $n$ | $m$ | $k$ | $c$ | $\ell=m/c$ | $t=k/c$ | Input size | Encoded size |
+| --- | --- | --- | --- | ---: | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
+| `tiny` | KoalaBear | Quintic extension | Poseidon-16/8 | $1/2$ | Special barycentric | 1 | 2 | 16 | 8 | 8 | 2 | 1 | 64 B | 128 B |
+| `medium` | KoalaBear | Quintic extension | Poseidon-16/8 | $1/2$ | Special barycentric | 1 | 8 | 256 | 128 | 8 | 32 | 16 | 4 KiB | 8 KiB |
+| `large` | KoalaBear | Quintic extension | Poseidon-16/8 | $1/2$ | Special barycentric | 1 | 16 | 1,024 | 512 | 8 | 128 | 64 | 32 KiB | 64 KiB |
+| `stress` | KoalaBear | Quintic extension | Poseidon-16/8 | $1/2$ | Special barycentric | 1 | 32 | 4,096 | 2,048 | 8 | 512 | 256 | 256 KiB | 512 KiB |
+| `blob-128k-1` | KoalaBear | Quintic extension | Poseidon-16/8 | $1/2$ | Special barycentric | 1 | 1 | 65,536 | 32,768 | 64 | 1,024 | 512 | 128 KiB | 256 KiB |
+| `blob-128k-4` | KoalaBear | Quintic extension | Poseidon-16/8 | $1/2$ | Special barycentric | 1 | 4 | 65,536 | 32,768 | 64 | 1,024 | 512 | 512 KiB | 1 MiB |
+
+Sizes use the four-byte canonical serialization of one KoalaBear element.
+
+## Benchmark Results
+
+All six proofs were accepted and all six profiles completed reconstruction from an independently sampled arbitrary set of exactly $t$ cells. LeanVM currently configures WHIR for $124$-bit security, so sampling uses the minimum number of distinct cells satisfying the matching worst-case availability bound:
+$$
+\Pr[\mathsf{miss}]
+=\frac{\binom{t-1}{q}}{\binom{\ell}{q}}
+\le 2^{-124},
+$$
+where an unreconstructable encoding has at most $t-1$ available cell columns. For the two smallest domains, $124$-bit soundness requires $q=t$, making the bound zero. The larger profiles achieve the target with $q<t$. Raising only the sampling target above $124$ bits would not raise the end-to-end demo security while WHIR remains configured for $124$ bits.
+
+| Dataset | Bytecode instructions | Read-only elements | Opened cells $q$ | Commitment size (KB) | Proof size (KB) | Sample size (KB) | Encode + commit | Prover preprocess | LeanVM prove | Verifier rebuild + LeanVM verify | Verify openings | Reconstruct | Result |
+| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | --- |
+| `tiny` | 512 | 104 | 1 | 0.094 | 187.723 | 0.098 | 0.000s | 0.005s | 0.100s | 0.028s | 0.000s | 0.000s | Correct |
+| `medium` | 512 | 1,352 | 16 | 0.281 | 260.516 | 6.562 | 0.001s | 0.006s | 0.200s | 0.035s | 0.000s | 0.000s | Correct |
+| `large` | 1,024 | 5,256 | 63 | 0.531 | 308.270 | 45.527 | 0.006s | 0.015s | 0.915s | 0.045s | 0.002s | 0.003s | Correct |
+| `stress` | 1,024 | 20,744 | 105 | 1.031 | 376.602 | 134.941 | 0.043s | 0.016s | 6.812s | 0.055s | 0.006s | 0.025s | Correct |
+| `blob-128k-1` | 1,024 | 327,696 | 114 | 0.062 | 357.512 | 64.570 | 0.023s | 0.077s | 3.600s | 0.130s | 0.003s | 0.067s | Correct |
+| `blob-128k-4` | 1,024 | 327,720 | 114 | 0.156 | 398.496 | 150.070 | 0.097s | 0.079s | 17.400s | 0.209s | 0.008s | 0.143s | Correct |
+
+Size columns use $1\,\mathrm{KB}=1024$ bytes. The benchmark columns measure the following:
+
+- **Bytecode instructions:** the number of LeanVM instructions in the compiled, profile-specific zkDSL program after padding to the bytecode domain. Public row hashes, ${\sf root}$, and $L$ are not expanded into assignment instructions, so increasing $m$ mainly enlarges the read-only segment rather than the instruction count.
+
+- **Read-only elements:** the number of KoalaBear elements stored in LeanVM's statement-bound read-only segment. It contains all $n$ row-hash digests, the Merkle ${\sf root}$, and the $m$ extension-field coefficients of $L$, each represented by five base-field coordinates:
+  $$
+  8n+8+5m.
+  $$
+
+- **Opened cells $q$:** the minimum number of distinct random cell columns needed to make the worst-case availability false-accept probability at most $2^{-124}$, matching LeanVM's current WHIR setting. This is a DAS sampling count, not the reconstruction threshold $t$. Each opening contains the corresponding cell from every row and a Merkle authentication path.
+
+- **Commitment size:** the size in KB of the canonical four-byte field serialization of the construction commitment $\{h_i\}_{i=1}^{n}$ and ${\sf root}$:
+  $$
+  (n+1)\cdot 8\cdot 4\ \text{bytes}.
+  $$
+  The profile ${\sf pp}$ is assumed to be agreed public metadata and is not counted again.
+
+- **Proof size:** `proof_size_fe()` reported by the LeanVM/WHIR proof, multiplied by four bytes per canonically serialized KoalaBear element and converted to KB. This measures the cryptographic execution proof only; it excludes the commitment and sampled cell openings.
+
+- **Sample size:** the canonical wire size in KB of the benchmark `Transcript`. For each of the $q$ openings it counts one four-byte cell index, $n\cdot c$ four-byte field elements, and $\log_2\ell$ sibling digests of eight field elements:
+  $$
+  q\left(4+4nc+32\log_2\ell\right)\ \text{bytes}.
+  $$
+
+- **Encode + commit:** host time for `encode_and_commit()`. It includes RS encoding of all $n$ blobs, computation of every row hash, computation of all column hashes, and construction of the complete Poseidon Merkle tree. It does not include Fiat–Shamir, $L$ generation, bytecode compilation, or LeanVM proving.
+
+- **Prover preprocess:** host time for the prover's `prepare_statement()`. Starting from the public `Commitment`, it computes the Fiat–Shamir digest, derives $p$ and $q$, constructs $L$ using batch inversion, compiles the compact profile-specific zkDSL program, and attaches the row hashes, ${\sf root}$, and $L$ as read-only public data. It does not execute or prove the LeanVM program.
+
+- **LeanVM prove:** time spent in `prove_codewords()` after encoding, commitment construction, Fiat–Shamir, $L$ generation, and bytecode compilation have finished. It includes witness packaging, execution of the zkDSL program, execution-trace construction, and generation of the LeanVM/WHIR proof for the row-hash, Merkle-root, and RS inner-product relations.
+
+- **Verifier rebuild + LeanVM verify:** total verifier time for `verify_execution_proof()`. The verifier first runs its own `prepare_statement(commitment)`, independently recomputing Fiat–Shamir, $p$, $q$, all $L_j$, the read-only segment, and the expected bytecode. It then invokes LeanVM `verify_execution()` on the proof. This column therefore includes both public $L$ validation by reconstruction and cryptographic proof verification, but excludes Merkle opening verification.
+
+- **Verify openings:** host time for `verify_openings()`. It recomputes the Poseidon column digest for every opened cell column and verifies every Merkle authentication path against the public ${\sf root}$. It does not include LeanVM proof verification or RS reconstruction.
+
+- **Reconstruct:** total time for `reconstruct()`, including re-verification of the $t$ reconstruction openings, one shared arbitrary-erasure locator preparation, and recovery of all $n$ rows through numerator IFFT, fast exact division, and systematic-domain FFT.
+
+- **Result:** whether the reconstructed blobs exactly equal the original input blobs.
+
+## Implemented Engineering Optimizations
+
+- The prover's `PreparedStatement` caches the commitment, its single generated $L$ vector, and compiled bytecode for proving.
+- The verifier accepts only the public `Commitment` and proof, then independently recomputes Fiat–Shamir, $p$, $q$, $L$, and the bound bytecode before proof verification.
+- LeanVM has a read-only public-data segment for row hashes, ${\sf root}$, and $L$. It is immutable, included in the bytecode hash, and constrained by the public-memory polynomial.
+- Public constants are no longer expanded into hundreds of thousands of VM assignment instructions.
+- Within each party, $L$ is computed once and reused instead of being regenerated per row. The verifier's copy is independently derived rather than received from the prover.
+- Availability sampling derives distinct Poseidon-based indices and computes the minimum count for a $124$-bit worst-case hypergeometric bound matching LeanVM instead of opening the reconstruction threshold by default.
+- Montgomery batch inversion reduces $L$ generation from $2k$ extension-field inversions to one inversion plus linear-many multiplications.
+- Poseidon sponge absorption uses runtime loops, keeping bytecode size nearly constant as $k$ and $m$ grow.
+- RS encoding uses radix-2 IFFT/FFT rather than quadratic Lagrange evaluation.
+- Arbitrary-cell reconstruction replaces quadratic barycentric interpolation with a shared FFT subproduct-tree erasure locator, Newton exact division, and FFT recovery. Locator preparation is reused across every row.
+- RS membership inside LeanVM uses the optimized `dot_product_be` extension-field precompile.
+- Benchmark timing is separated into encoding/commitment, prover preprocessing, proving, verifier statement reconstruction plus proof verification, and opening verification. Commitment, proof, and sample sizes are reported separately.
+
+## Running Commands
+
+Run correctness tests:
+
+```bash
+cargo test --release -p pq_das -- --nocapture
+```
+
+Run the six benchmark datasets:
+
+```bash
+cargo run --release -p pq_das -- --profile tiny
+cargo run --release -p pq_das -- --profile medium
+cargo run --release -p pq_das -- --profile large
+cargo run --release -p pq_das -- --profile stress
+cargo run --release -p pq_das -- --profile blob-128k-1
+cargo run --release -p pq_das -- --profile blob-128k-4
+```
+
+Build once and run all six benchmarks:
+
+```bash
+cargo build --release -p pq_das && \
+  for p in tiny medium large stress blob-128k-1 blob-128k-4; do
+    target/release/pq_das --profile "$p"
+  done
+```
+
+Run a custom half-rate profile:
+
+```bash
+cargo run --release -p pq_das -- \
+  --profile custom --n 4 --m 128 --k 64 --c 8
+```
+
+Override the WHIR inverse-rate exponent:
+
+```bash
+cargo run --release -p pq_das -- \
+  --profile medium --whir-log-inv-rate 2
+```
diff --git a/crates/pq_das/src/config.rs b/crates/pq_das/src/config.rs
new file mode 100644
index 000000000..2c1688d67
--- /dev/null
+++ b/crates/pq_das/src/config.rs
@@ -0,0 +1,225 @@
+use std::fmt::{Display, Formatter};
+
+use backend::PrimeCharacteristicRing;
+use lean_vm::F;
+
+pub const DIGEST_LEN: usize = 8;
+pub const EXT_DEGREE: usize = 5;
+/// Matches LeanVM's current default WHIR security target.
+pub const SAMPLING_SOUNDNESS_BITS: usize = lean_prover::SECURITY_BITS;
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+pub struct ParameterProfile {
+    pub name: &'static str,
+    pub n: usize,
+    pub m: usize,
+    pub k: usize,
+    pub c: usize,
+    pub whir_log_inv_rate: usize,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct ProfileError(pub String);
+
+impl Display for ProfileError {
+    /// Formats a parameter-validation failure.
+    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+        f.write_str(&self.0)
+    }
+}
+
+impl std::error::Error for ProfileError {}
+
+impl ParameterProfile {
+    pub const TINY: Self = Self {
+        name: "tiny",
+        n: 2,
+        m: 16,
+        k: 8,
+        c: 8,
+        whir_log_inv_rate: 1,
+    };
+
+    pub const MEDIUM: Self = Self {
+        name: "medium",
+        n: 8,
+        m: 256,
+        k: 128,
+        c: 8,
+        whir_log_inv_rate: 1,
+    };
+
+    pub const LARGE: Self = Self {
+        name: "large",
+        n: 16,
+        m: 1024,
+        k: 512,
+        c: 8,
+        whir_log_inv_rate: 1,
+    };
+
+    pub const STRESS: Self = Self {
+        name: "stress",
+        n: 32,
+        m: 4096,
+        k: 2048,
+        c: 8,
+        whir_log_inv_rate: 1,
+    };
+
+    pub const BLOB_128K_1: Self = Self {
+        name: "blob-128k-1",
+        n: 1,
+        m: 65536,
+        k: 32768,
+        c: 64,
+        whir_log_inv_rate: 1,
+    };
+
+    pub const BLOB_128K_4: Self = Self {
+        name: "blob-128k-4",
+        n: 4,
+        m: 65536,
+        k: 32768,
+        c: 64,
+        whir_log_inv_rate: 1,
+    };
+
+    pub const BLOB_128K_16: Self = Self {
+        name: "blob-128k-16",
+        n: 16,
+        m: 65536,
+        k: 32768,
+        c: 64,
+        whir_log_inv_rate: 1,
+    };
+
+    /// Constructs a custom half-rate profile and validates it before use.
+    pub fn custom(n: usize, m: usize, k: usize, c: usize, whir_log_inv_rate: usize) -> Result<Self, ProfileError> {
+        let profile = Self {
+            name: "custom",
+            n,
+            m,
+            k,
+            c,
+            whir_log_inv_rate,
+        };
+        profile.validate()?;
+        Ok(profile)
+    }
+
+    /// Returns the number of cell columns in each encoded row.
+    pub const fn n_cells(self) -> usize {
+        self.m / self.c
+    }
+
+    /// Returns the number of distinct cells required for reconstruction.
+    pub const fn reconstruction_threshold_cells(self) -> usize {
+        self.k.div_ceil(self.c)
+    }
+
+    /// Returns the minimum number of distinct samples needed for the requested availability soundness.
+    pub fn sampling_count(self, soundness_bits: usize) -> usize {
+        let n_cells = self.n_cells();
+        let max_available_without_reconstruction = self.reconstruction_threshold_cells() - 1;
+        let mut log2_failure = 0.0;
+
+        for sample_count in 1..=n_cells {
+            if sample_count > max_available_without_reconstruction {
+                return sample_count;
+            }
+            let offset = sample_count - 1;
+            log2_failure +=
+                ((max_available_without_reconstruction - offset) as f64).log2() - ((n_cells - offset) as f64).log2();
+            if log2_failure <= -(soundness_bits as f64) {
+                return sample_count;
+            }
+        }
+        n_cells
+    }
+
+    /// Returns the log2 false-accept probability for distinct sampling in the worst unreconstructable case.
+    pub fn sampling_log2_failure(self, sample_count: usize) -> f64 {
+        let n_cells = self.n_cells();
+        let max_available_without_reconstruction = self.reconstruction_threshold_cells() - 1;
+        if sample_count > max_available_without_reconstruction {
+            return f64::NEG_INFINITY;
+        }
+        (0..sample_count)
+            .map(|offset| {
+                ((max_available_without_reconstruction - offset) as f64).log2() - ((n_cells - offset) as f64).log2()
+            })
+            .sum()
+    }
+
+    /// Returns the spacing between systematic subgroup evaluations.
+    pub const fn systematic_stride(self) -> usize {
+        self.m / self.k
+    }
+
+    /// Returns the binary Merkle-tree depth.
+    pub const fn merkle_depth(self) -> usize {
+        self.n_cells().ilog2() as usize
+    }
+
+    /// Checks all field, FFT, hashing, Merkle, and current membership constraints.
+    pub fn validate(self) -> Result<(), ProfileError> {
+        if self.n == 0 || self.k == 0 || self.m == 0 || self.c == 0 {
+            return Err(ProfileError("n, m, k, and c must be non-zero".to_string()));
+        }
+        if !self.m.is_power_of_two() || !self.k.is_power_of_two() {
+            return Err(ProfileError("m and k must be powers of two".to_string()));
+        }
+        if self.m != 2 * self.k {
+            return Err(ProfileError(
+                "the current special-barycentric implementation requires m = 2k".to_string(),
+            ));
+        }
+        if !self.m.is_multiple_of(self.c) {
+            return Err(ProfileError("c must divide m".to_string()));
+        }
+        if !self.n_cells().is_power_of_two() {
+            return Err(ProfileError("m/c must be a power of two".to_string()));
+        }
+        if !self.k.is_multiple_of(DIGEST_LEN) || !(self.n * self.c).is_multiple_of(DIGEST_LEN) {
+            return Err(ProfileError(
+                "k and n*c must be multiples of the Poseidon rate 8".to_string(),
+            ));
+        }
+        if !(1..=4).contains(&self.whir_log_inv_rate) {
+            return Err(ProfileError("WHIR log inverse rate must be in 1..=4".to_string()));
+        }
+        if self.m.ilog2() > 24 {
+            return Err(ProfileError("m exceeds KoalaBear two-adicity".to_string()));
+        }
+        Ok(())
+    }
+
+    /// Encodes this profile into one field-element block for public Fiat-Shamir preprocessing.
+    pub fn profile_block(self) -> [F; DIGEST_LEN] {
+        [
+            F::from_u32(0x5051_4441),
+            F::ONE,
+            F::from_usize(self.n),
+            F::from_usize(self.m),
+            F::from_usize(self.k),
+            F::from_usize(self.c),
+            F::from_usize(self.n_cells()),
+            F::from_usize(self.whir_log_inv_rate),
+        ]
+    }
+}
+
+/// Encodes the RS Fiat-Shamir domain separator into one Poseidon-rate block.
+pub fn fs_block() -> [F; DIGEST_LEN] {
+    [
+        F::from_u32(0x5253_4348),
+        F::ONE,
+        F::ZERO,
+        F::ZERO,
+        F::ZERO,
+        F::ZERO,
+        F::ZERO,
+        F::ZERO,
+    ]
+}
diff --git a/crates/pq_das/src/encoding.rs b/crates/pq_das/src/encoding.rs
new file mode 100644
index 000000000..9ba78a8f3
--- /dev/null
+++ b/crates/pq_das/src/encoding.rs
@@ -0,0 +1,283 @@
+use backend::{Field, PrimeCharacteristicRing, TwoAdicField};
+use lean_vm::F;
+
+use crate::config::ParameterProfile;
+
+pub type Blob = Vec<F>;
+pub type Codeword = Vec<F>;
+pub type Data = Vec<Blob>;
+pub type Codewords = Vec<Codeword>;
+
+/// Applies an in-place radix-2 FFT using the supplied root of unity.
+fn fft(values: &mut [F], root: F) {
+    assert!(values.len().is_power_of_two());
+    let n = values.len();
+    for i in 1..n {
+        let j = i.reverse_bits() >> (usize::BITS - n.ilog2());
+        if i < j {
+            values.swap(i, j);
+        }
+    }
+    let mut len = 2;
+    while len <= n {
+        let step = root.exp_u64((n / len) as u64);
+        for chunk in values.chunks_exact_mut(len) {
+            let mut twiddle = F::ONE;
+            let (left, right) = chunk.split_at_mut(len / 2);
+            for (a, b) in left.iter_mut().zip(right) {
+                let odd = *b * twiddle;
+                let even = *a;
+                *a = even + odd;
+                *b = even - odd;
+                twiddle *= step;
+            }
+        }
+        len *= 2;
+    }
+}
+
+/// Applies an inverse radix-2 FFT and normalizes the resulting coefficients.
+fn ifft(values: &mut [F], root: F) {
+    fft(values, root.inverse());
+    let n_inv = F::from_usize(values.len()).inverse();
+    for value in values {
+        *value *= n_inv;
+    }
+}
+
+/// Multiplies two coefficient-form polynomials using a radix-2 FFT convolution.
+fn multiply_polynomials(left: &[F], right: &[F]) -> Vec<F> {
+    if left.is_empty() || right.is_empty() {
+        return Vec::new();
+    }
+    let output_len = left.len() + right.len() - 1;
+    if left.len().min(right.len()) <= 16 {
+        let mut output = vec![F::ZERO; output_len];
+        for (i, &a) in left.iter().enumerate() {
+            for (j, &b) in right.iter().enumerate() {
+                output[i + j] += a * b;
+            }
+        }
+        return output;
+    }
+
+    let fft_len = output_len.next_power_of_two();
+    let root = F::two_adic_generator(fft_len.ilog2() as usize);
+    let mut left_evals = vec![F::ZERO; fft_len];
+    let mut right_evals = vec![F::ZERO; fft_len];
+    left_evals[..left.len()].copy_from_slice(left);
+    right_evals[..right.len()].copy_from_slice(right);
+    fft(&mut left_evals, root);
+    fft(&mut right_evals, root);
+    for (left, right) in left_evals.iter_mut().zip(right_evals) {
+        *left *= right;
+    }
+    ifft(&mut left_evals, root);
+    left_evals.truncate(output_len);
+    left_evals
+}
+
+/// Builds the monic polynomial whose roots are the supplied field elements.
+fn root_polynomial(roots: &[F]) -> Vec<F> {
+    const CHUNK_ROOTS: usize = 16;
+
+    let mut level: Vec<Vec<F>> = roots
+        .chunks(CHUNK_ROOTS)
+        .map(|chunk| {
+            let mut polynomial = vec![F::ONE];
+            for &root in chunk {
+                let mut next = vec![F::ZERO; polynomial.len() + 1];
+                for (degree, &coefficient) in polynomial.iter().enumerate() {
+                    next[degree] -= coefficient * root;
+                    next[degree + 1] += coefficient;
+                }
+                polynomial = next;
+            }
+            polynomial
+        })
+        .collect();
+    if level.is_empty() {
+        return vec![F::ONE];
+    }
+
+    while level.len() > 1 {
+        let mut next = Vec::with_capacity(level.len().div_ceil(2));
+        let mut pairs = level.chunks_exact(2);
+        for pair in &mut pairs {
+            next.push(multiply_polynomials(&pair[0], &pair[1]));
+        }
+        if let Some(last) = pairs.remainder().first() {
+            next.push(last.clone());
+        }
+        level = next;
+    }
+    level.pop().unwrap()
+}
+
+/// Computes a truncated formal power-series inverse with Newton iteration.
+fn invert_series(polynomial: &[F], target_len: usize) -> Vec<F> {
+    debug_assert!(!polynomial.is_empty() && polynomial[0] != F::ZERO);
+    let mut inverse = vec![polynomial[0].inverse()];
+    while inverse.len() < target_len {
+        let next_len = (2 * inverse.len()).min(target_len);
+        let product = multiply_polynomials(&polynomial[..polynomial.len().min(next_len)], &inverse);
+        let mut correction = vec![F::ZERO; next_len];
+        correction[0] = F::TWO;
+        for (output, value) in correction.iter_mut().zip(product) {
+            *output -= value;
+        }
+        inverse = multiply_polynomials(&inverse, &correction);
+        inverse.truncate(next_len);
+    }
+    inverse
+}
+
+/// Systematically RS-encodes one blob on a subgroup using a k-IFFT and m-FFT.
+pub fn encode_blob(profile: ParameterProfile, blob: &[F]) -> Codeword {
+    assert_eq!(blob.len(), profile.k);
+    let omega = F::two_adic_generator(profile.m.ilog2() as usize);
+    let systematic_root = omega.exp_u64(profile.systematic_stride() as u64);
+    let mut coefficients = blob.to_vec();
+    ifft(&mut coefficients, systematic_root);
+    coefficients.resize(profile.m, F::ZERO);
+    fft(&mut coefficients, omega);
+    coefficients
+}
+
+/// RS-encodes all blobs into the rows of the PQ-DAS codeword matrix.
+pub fn encode(profile: ParameterProfile, data: &Data) -> Codewords {
+    assert_eq!(data.len(), profile.n);
+    data.iter().map(|blob| encode_blob(profile, blob)).collect()
+}
+
+/// Reuses locator data to recover many rows with the same arbitrary erasure pattern.
+#[derive(Debug)]
+pub struct ErasureDecoder {
+    profile: ParameterProfile,
+    known_indices: Vec<usize>,
+    locator_evaluations: Vec<F>,
+    reversed_locator_inverse: Vec<F>,
+    numerator_max_degree: usize,
+}
+
+impl ErasureDecoder {
+    /// Precomputes the arbitrary-erasure locator, its domain evaluations, and its reversed inverse.
+    pub fn new(profile: ParameterProfile, known_indices: &[usize]) -> Option<Self> {
+        if known_indices.len() < profile.k {
+            return None;
+        }
+        let mut known = vec![false; profile.m];
+        for &index in known_indices {
+            if index >= profile.m || std::mem::replace(&mut known[index], true) {
+                return None;
+            }
+        }
+
+        let omega = F::two_adic_generator(profile.m.ilog2() as usize);
+        let mut point = F::ONE;
+        let mut erased_points = Vec::with_capacity(profile.m - known_indices.len());
+        for is_known in &known {
+            if !is_known {
+                erased_points.push(point);
+            }
+            point *= omega;
+        }
+
+        // Z(X) vanishes exactly at missing codeword positions. The subproduct
+        // tree uses FFT polynomial multiplication for quasi-linear preparation.
+        let locator = root_polynomial(&erased_points);
+        let mut locator_evaluations = vec![F::ZERO; profile.m];
+        locator_evaluations[..locator.len()].copy_from_slice(&locator);
+        fft(&mut locator_evaluations, omega);
+
+        // Exact division N(X)/Z(X) becomes a truncated series product after
+        // reversing both polynomials; Newton iteration prepares 1/rev(Z).
+        let reversed_locator: Vec<_> = locator.iter().rev().copied().collect();
+        let reversed_locator_inverse = invert_series(&reversed_locator, profile.k);
+        Some(Self {
+            profile,
+            known_indices: known_indices.to_vec(),
+            locator_evaluations,
+            reversed_locator_inverse,
+            numerator_max_degree: profile.k + erased_points.len() - 1,
+        })
+    }
+
+    /// Recovers one systematic row using numerator IFFT, fast exact division, and a systematic-domain FFT.
+    pub fn reconstruct_blob(&self, values: &[F]) -> Option<Blob> {
+        if values.len() != self.known_indices.len() {
+            return None;
+        }
+
+        // N(X)=f(X)Z(X). Its complete evaluation vector is known: it is zero
+        // at erasures and equals the received value times Z at known points.
+        let mut numerator = vec![F::ZERO; self.profile.m];
+        for (&index, &value) in self.known_indices.iter().zip(values) {
+            numerator[index] = value * self.locator_evaluations[index];
+        }
+        let omega = F::two_adic_generator(self.profile.m.ilog2() as usize);
+        ifft(&mut numerator, omega);
+
+        let reversed_numerator: Vec<_> = (0..self.profile.k)
+            .map(|offset| numerator[self.numerator_max_degree - offset])
+            .collect();
+        let mut reversed_coefficients = multiply_polynomials(&reversed_numerator, &self.reversed_locator_inverse);
+        reversed_coefficients.truncate(self.profile.k);
+        reversed_coefficients.reverse();
+
+        let systematic_root = omega.exp_u64(self.profile.systematic_stride() as u64);
+        fft(&mut reversed_coefficients, systematic_root);
+        Some(reversed_coefficients)
+    }
+}
+
+/// Reconstructs one systematic blob from arbitrary indexed codeword evaluations.
+pub fn reconstruct_blob(profile: ParameterProfile, samples: &[(usize, F)]) -> Option<Blob> {
+    if samples.len() < profile.k {
+        return None;
+    }
+    let indices: Vec<_> = samples.iter().map(|(index, _)| *index).collect();
+    let values: Vec<_> = samples.iter().map(|(_, value)| *value).collect();
+    ErasureDecoder::new(profile, &indices)?.reconstruct_blob(&values)
+}
+
+/// Produces deterministic field-element input data for a selected profile.
+pub fn demo_data(profile: ParameterProfile) -> Data {
+    (0..profile.n)
+        .map(|row| {
+            (0..profile.k)
+                .map(|col| F::from_usize(1 + row * profile.k + col))
+                .collect()
+        })
+        .collect()
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    /// Checks generalized systematic FFT encoding on the tiny profile.
+    fn encoding_is_systematic() {
+        let profile = ParameterProfile::TINY;
+        let data = demo_data(profile);
+        let codewords = encode(profile, &data);
+        for row in 0..profile.n {
+            let systematic: Vec<_> = (0..profile.k)
+                .map(|i| codewords[row][i * profile.systematic_stride()])
+                .collect();
+            assert_eq!(systematic, data[row]);
+        }
+    }
+
+    #[test]
+    /// Recovers a row from an arbitrary non-subgroup half of its codeword positions.
+    fn arbitrary_erasure_reconstruction() {
+        let profile = ParameterProfile::MEDIUM;
+        let blob = demo_data(profile).remove(0);
+        let codeword = encode_blob(profile, &blob);
+        let indices: Vec<_> = (0..profile.k).map(|i| (37 * i + 11) % profile.m).collect();
+        let samples: Vec<_> = indices.iter().map(|&index| (index, codeword[index])).collect();
+        assert_eq!(reconstruct_blob(profile, &samples).unwrap(), blob);
+    }
+}
diff --git a/crates/pq_das/src/hashing.rs b/crates/pq_das/src/hashing.rs
new file mode 100644
index 000000000..fc495547e
--- /dev/null
+++ b/crates/pq_das/src/hashing.rs
@@ -0,0 +1,59 @@
+use backend::{poseidon_hash_slice, poseidon16_compress_pair};
+use lean_vm::F;
+
+use crate::{
+    Commitment,
+    config::{DIGEST_LEN, ParameterProfile, fs_block},
+    encoding::{Codeword, Codewords},
+};
+
+pub type Digest = [F; DIGEST_LEN];
+
+/// Hashes the systematic symbols of one codeword into its public row digest.
+pub fn row_hash(profile: ParameterProfile, row: &Codeword) -> Digest {
+    let systematic: Vec<F> = (0..profile.k).map(|i| row[i * profile.systematic_stride()]).collect();
+    poseidon_hash_slice(&systematic)
+}
+
+/// Hashes all rows' symbols in one cell column into a Merkle leaf.
+pub fn column_hash(profile: ParameterProfile, codewords: &Codewords, cell: usize) -> Digest {
+    let start = cell * profile.c;
+    let mut values = Vec::with_capacity(profile.n * profile.c);
+    for row in codewords {
+        values.extend_from_slice(&row[start..start + profile.c]);
+    }
+    poseidon_hash_slice(&values)
+}
+
+/// Builds a complete binary Poseidon Merkle tree from a power-of-two leaf set.
+pub fn merkle_layers(leaves: &[Digest]) -> Vec<Vec<Digest>> {
+    assert!(leaves.len().is_power_of_two());
+    let mut layers = vec![leaves.to_vec()];
+    while layers.last().unwrap().len() > 1 {
+        let next = layers
+            .last()
+            .unwrap()
+            .chunks_exact(2)
+            .map(|pair| poseidon16_compress_pair(&pair[0], &pair[1]))
+            .collect();
+        layers.push(next);
+    }
+    layers
+}
+
+/// Returns the root of a complete binary Poseidon Merkle tree.
+pub fn merkle_root(leaves: &[Digest]) -> Digest {
+    merkle_layers(leaves).last().unwrap()[0]
+}
+
+/// Derives the public RS Fiat-Shamir digest from the profile and commitment.
+pub fn fiat_shamir_digest(commitment: &Commitment) -> Digest {
+    let mut values = Vec::with_capacity((3 + commitment.profile.n) * DIGEST_LEN);
+    values.extend_from_slice(&fs_block());
+    values.extend_from_slice(&commitment.profile.profile_block());
+    for hash in &commitment.row_hashes {
+        values.extend_from_slice(hash);
+    }
+    values.extend_from_slice(&commitment.root);
+    poseidon_hash_slice(&values)
+}
diff --git a/crates/pq_das/src/lib.rs b/crates/pq_das/src/lib.rs
new file mode 100644
index 000000000..e75f92044
--- /dev/null
+++ b/crates/pq_das/src/lib.rs
@@ -0,0 +1,346 @@
+mod config;
+mod encoding;
+mod hashing;
+mod membership;
+mod protocol;
+
+use std::{
+    collections::BTreeMap,
+    fmt::{Display, Formatter},
+};
+
+use backend::{ArenaVec, PrimeCharacteristicRing, arena_vec};
+use lean_compiler::{CompilationFlags, ProgramSource, compile_program_with_flags};
+use lean_prover::{
+    default_whir_config,
+    prove_execution::{ExecutionProof, prove_execution},
+    verify_execution::verify_execution,
+};
+use lean_vm::{Bytecode, ExecutionWitness, F, Hints};
+
+pub use config::*;
+pub use encoding::{Blob, Codeword, Codewords, Data, ErasureDecoder, demo_data, encode, encode_blob};
+pub use hashing::Digest;
+pub use protocol::{
+    AuxiliaryData, CellOpening, Transcript, commit, commitment_size_bytes, encode_and_commit, query, reconstruct,
+    sample_query_indices, systematic_symbols_per_cell, transcript_size_bytes, verify, verify_openings,
+};
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct Commitment {
+    pub profile: ParameterProfile,
+    pub row_hashes: Vec<Digest>,
+    pub root: Digest,
+}
+
+#[derive(Debug, Clone)]
+pub struct ProofBundle {
+    pub execution: ExecutionProof,
+}
+
+#[derive(Debug, Clone)]
+pub struct PreparedStatement {
+    /// Public profile, row hashes, and Merkle root represented by this statement.
+    pub commitment: Commitment,
+    /// Public special-barycentric vector generated once during preparation.
+    pub check_vector: membership::CheckVector,
+    /// Compact profile bytecode with the statement values bound read-only.
+    pub bytecode: Bytecode,
+}
+
+#[derive(Debug)]
+pub enum DemoError {
+    InvalidDataShape,
+    InvalidQuery,
+    InvalidOpening,
+    InsufficientCells,
+    ReconstructionFailed,
+    ChallengeOnDomain,
+    Profile(ProfileError),
+    Prover(lean_prover::ProverError),
+    Verification(backend::ProofError),
+}
+
+impl Display for DemoError {
+    /// Formats each demo failure as a concise user-facing diagnostic.
+    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
+        match self {
+            Self::InvalidDataShape => write!(f, "data dimensions do not match the selected profile"),
+            Self::InvalidQuery => write!(f, "query contains an invalid or duplicate cell index"),
+            Self::InvalidOpening => write!(f, "invalid cell opening"),
+            Self::InsufficientCells => write!(f, "not enough distinct cells to reconstruct"),
+            Self::ReconstructionFailed => write!(f, "RS reconstruction failed"),
+            Self::ChallengeOnDomain => write!(f, "Fiat-Shamir challenge lies on the interpolation domain"),
+            Self::Profile(err) => write!(f, "invalid parameter profile: {err}"),
+            Self::Prover(err) => write!(f, "LeanVM prover failed: {err}"),
+            Self::Verification(err) => write!(f, "LeanVM verification failed: {err}"),
+        }
+    }
+}
+
+impl std::error::Error for DemoError {}
+
+impl From<ProfileError> for DemoError {
+    /// Converts profile validation failures into the demo's unified error type.
+    fn from(value: ProfileError) -> Self {
+        Self::Profile(value)
+    }
+}
+
+impl From<lean_prover::ProverError> for DemoError {
+    /// Converts LeanVM prover failures into the demo's unified error type.
+    fn from(value: lean_prover::ProverError) -> Self {
+        Self::Prover(value)
+    }
+}
+
+/// Loads the embedded generalized zkDSL verifier program.
+fn guest_source() -> ProgramSource {
+    ProgramSource::Raw(include_str!("../zkdsl/main.py").to_string())
+}
+
+/// Creates profile-specific zkDSL replacements and read-only memory pointers.
+fn compilation_flags(commitment: &Commitment) -> Result<CompilationFlags, DemoError> {
+    commitment.profile.validate()?;
+    if commitment.row_hashes.len() != commitment.profile.n {
+        return Err(DemoError::InvalidDataShape);
+    }
+    let profile = commitment.profile;
+    let mut replacements = BTreeMap::new();
+    for (name, value) in [
+        ("N_PLACEHOLDER", profile.n),
+        ("M_PLACEHOLDER", profile.m),
+        ("K_PLACEHOLDER", profile.k),
+        ("C_PLACEHOLDER", profile.c),
+        ("N_CELLS_PLACEHOLDER", profile.n_cells()),
+        ("SYSTEMATIC_STRIDE_PLACEHOLDER", profile.systematic_stride()),
+        ("ROW_CHUNKS_PLACEHOLDER", profile.k / DIGEST_LEN),
+        ("COLUMN_CHUNKS_PLACEHOLDER", profile.n * profile.c / DIGEST_LEN),
+        ("MERKLE_DEPTH_PLACEHOLDER", profile.merkle_depth()),
+        ("TREE_DIGESTS_PLACEHOLDER", 2 * profile.n_cells() - 1),
+        ("PUBLIC_ROW_HASHES_PTR_PLACEHOLDER", DIGEST_LEN),
+        ("PUBLIC_ROOT_PTR_PLACEHOLDER", DIGEST_LEN + profile.n * DIGEST_LEN),
+        ("CHECK_VECTOR_PTR_PLACEHOLDER", 2 * DIGEST_LEN + profile.n * DIGEST_LEN),
+    ] {
+        replacements.insert(name.to_string(), value.to_string());
+    }
+
+    let mut sizes = Vec::with_capacity(profile.merkle_depth() + 1);
+    let mut offsets = Vec::with_capacity(profile.merkle_depth() + 1);
+    let mut size = profile.n_cells();
+    let mut offset = 0;
+    loop {
+        sizes.push(size);
+        offsets.push(offset);
+        if size == 1 {
+            break;
+        }
+        offset += size;
+        size /= 2;
+    }
+    replacements.insert(
+        "LEVEL_SIZES_PLACEHOLDER".to_string(),
+        format!("[{}]", sizes.iter().map(usize::to_string).collect::<Vec<_>>().join(",")),
+    );
+    replacements.insert(
+        "LEVEL_OFFSETS_PLACEHOLDER".to_string(),
+        format!(
+            "[{}]",
+            offsets.iter().map(usize::to_string).collect::<Vec<_>>().join(",")
+        ),
+    );
+    Ok(CompilationFlags { replacements })
+}
+
+/// Returns the legacy fixed public-input block; statement data lives read-only.
+fn leanvm_public_input() -> [F; DIGEST_LEN] {
+    [F::ZERO; DIGEST_LEN]
+}
+
+/// Flattens public hashes, root, and L into LeanVM's bound read-only segment.
+fn read_only_data(commitment: &Commitment, check_vector: &membership::CheckVector) -> Vec<F> {
+    let mut data =
+        Vec::with_capacity(commitment.profile.n * DIGEST_LEN + DIGEST_LEN + commitment.profile.m * EXT_DEGREE);
+    data.extend(commitment.row_hashes.iter().flatten().copied());
+    data.extend_from_slice(&commitment.root);
+    data.extend(check_vector.iter().flatten().copied());
+    data
+}
+
+/// Generates L once and compiles the reusable statement-bound LeanVM bytecode.
+pub fn prepare_statement(commitment: Commitment) -> Result<PreparedStatement, DemoError> {
+    let check_vector = membership::check_vector(&commitment).ok_or(DemoError::ChallengeOnDomain)?;
+    let bytecode = compile_program_with_flags(&guest_source(), compilation_flags(&commitment)?)
+        .with_read_only_data(read_only_data(&commitment, &check_vector));
+    Ok(PreparedStatement {
+        commitment,
+        check_vector,
+        bytecode,
+    })
+}
+
+/// Packages only the private codeword matrix into the LeanVM witness.
+fn witness(bytecode: &Bytecode, codewords: &Codewords) -> ExecutionWitness {
+    let flattened: Vec<_> = codewords.iter().flat_map(|row| row.iter().copied()).collect();
+    let mut hints = Hints::default();
+    hints.insert(bytecode, "codewords", arena_vec![ArenaVec::from_slice(&flattened)]);
+    ExecutionWitness {
+        hints,
+        ..Default::default()
+    }
+}
+
+/// Proves the prepared row-hash, Merkle-root, and RS dot-product statement.
+pub fn prove_codewords(prepared: &PreparedStatement, codewords: &Codewords) -> Result<ProofBundle, DemoError> {
+    let profile = prepared.commitment.profile;
+    if codewords.len() != profile.n || codewords.iter().any(|row| row.len() != profile.m) {
+        return Err(DemoError::InvalidDataShape);
+    }
+    let execution = prove_execution(
+        &prepared.bytecode,
+        &leanvm_public_input(),
+        &witness(&prepared.bytecode, codewords),
+        &default_whir_config(profile.whir_log_inv_rate),
+        false,
+    )?;
+    Ok(ProofBundle { execution })
+}
+
+/// Recomputes Fiat-Shamir and L from the public commitment before verifying.
+pub fn verify_execution_proof(commitment: &Commitment, proof: &ProofBundle) -> Result<(), DemoError> {
+    // The verifier never accepts L from the prover. Re-preparing the statement
+    // independently binds the proof to the unique L derived from public data.
+    let prepared = prepare_statement(commitment.clone())?;
+    verify_execution(
+        &prepared.bytecode,
+        &leanvm_public_input(),
+        proof.execution.proof.clone(),
+    )
+    .map(|_| ())
+    .map_err(DemoError::Verification)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use backend::PrimeCharacteristicRing;
+
+    /// Creates native commitment material without invoking the LeanVM prover.
+    fn native_material(profile: ParameterProfile) -> (Data, Commitment, AuxiliaryData) {
+        let data = demo_data(profile);
+        let codewords = encode(profile, &data);
+        let row_hashes = codewords.iter().map(|row| hashing::row_hash(profile, row)).collect();
+        let leaves: Vec<_> = (0..profile.n_cells())
+            .map(|cell| hashing::column_hash(profile, &codewords, cell))
+            .collect();
+        let merkle_layers = hashing::merkle_layers(&leaves);
+        let commitment = Commitment {
+            profile,
+            row_hashes,
+            root: merkle_layers.last().unwrap()[0],
+        };
+        let aux = AuxiliaryData {
+            profile,
+            codewords,
+            merkle_layers,
+        };
+        (data, commitment, aux)
+    }
+
+    #[test]
+    /// Exercises generalized native commitments, openings, and reconstruction.
+    fn native_protocol_roundtrip_and_tamper_detection() {
+        let profile = ParameterProfile::TINY;
+        let (data, commitment, aux) = native_material(profile);
+
+        let transcript = query(&aux, &[1]).unwrap();
+        assert!(verify_openings(&commitment, &transcript));
+        assert_eq!(
+            reconstruct(&commitment, std::slice::from_ref(&transcript)).unwrap(),
+            data
+        );
+
+        let mut bad = transcript;
+        bad.openings[0].cells[0][0] += F::ONE;
+        assert!(!verify_openings(&commitment, &bad));
+    }
+
+    #[test]
+    /// Confirms all built-in profiles satisfy the implementation constraints.
+    fn predefined_profiles_are_valid() {
+        for profile in [
+            ParameterProfile::TINY,
+            ParameterProfile::MEDIUM,
+            ParameterProfile::LARGE,
+            ParameterProfile::STRESS,
+            ParameterProfile::BLOB_128K_1,
+            ParameterProfile::BLOB_128K_4,
+            ParameterProfile::BLOB_128K_16,
+        ] {
+            profile.validate().unwrap();
+        }
+    }
+
+    #[test]
+    /// Pins the minimum distinct-cell counts that match LeanVM's availability-soundness target.
+    fn predefined_profiles_have_expected_sampling_counts() {
+        for (profile, expected) in [
+            (ParameterProfile::TINY, 1),
+            (ParameterProfile::MEDIUM, 16),
+            (ParameterProfile::LARGE, 63),
+            (ParameterProfile::STRESS, 105),
+            (ParameterProfile::BLOB_128K_1, 114),
+            (ParameterProfile::BLOB_128K_4, 114),
+            (ParameterProfile::BLOB_128K_16, 114),
+        ] {
+            let count = profile.sampling_count(SAMPLING_SOUNDNESS_BITS);
+            assert_eq!(count, expected);
+            assert!(profile.sampling_log2_failure(count) <= -(SAMPLING_SOUNDNESS_BITS as f64));
+            if count > 1 {
+                assert!(profile.sampling_log2_failure(count - 1) > -(SAMPLING_SOUNDNESS_BITS as f64));
+            }
+        }
+    }
+
+    #[test]
+    /// Proves and verifies the complete generalized construction on the tiny profile.
+    fn leanvm_end_to_end() {
+        backend::parallel::init();
+        let profile = ParameterProfile::TINY;
+        let data = demo_data(profile);
+        let (prepared, aux, proof) = commit(profile, &data).unwrap();
+        let query_indices = sample_query_indices(&prepared.commitment, &[F::from_u32(42); DIGEST_LEN], 1).unwrap();
+        let transcript = query(&aux, &query_indices).unwrap();
+        assert!(verify(&prepared.commitment, &proof, &transcript).unwrap());
+        assert_eq!(reconstruct(&prepared.commitment, &[transcript]).unwrap(), data);
+
+        let mut wrong_commitment = prepared.commitment.clone();
+        wrong_commitment.root[0] += F::ONE;
+        assert!(verify_execution_proof(&wrong_commitment, &proof).is_err());
+
+        let mut wrong_row_hash = prepared.commitment.clone();
+        wrong_row_hash.row_hashes[0][0] += F::ONE;
+        assert!(verify_execution_proof(&wrong_row_hash, &proof).is_err());
+    }
+
+    #[test]
+    /// Confirms the guest rejects a row that violates its public RS check vector.
+    fn guest_rejects_non_member_codeword() {
+        backend::parallel::init();
+        let profile = ParameterProfile::TINY;
+        let (_, mut commitment, aux) = native_material(profile);
+        let mut codewords = aux.codewords;
+        codewords[0][1] += F::ONE;
+        commitment.row_hashes = codewords.iter().map(|row| hashing::row_hash(profile, row)).collect();
+        let leaves: Vec<_> = (0..profile.n_cells())
+            .map(|cell| hashing::column_hash(profile, &codewords, cell))
+            .collect();
+        commitment.root = hashing::merkle_root(&leaves);
+        let prepared = prepare_statement(commitment).unwrap();
+        let execution_witness = witness(&prepared.bytecode, &codewords);
+        assert!(
+            lean_vm::try_execute_bytecode(&prepared.bytecode, &leanvm_public_input(), &execution_witness, false)
+                .is_err()
+        );
+    }
+}
diff --git a/crates/pq_das/src/main.rs b/crates/pq_das/src/main.rs
new file mode 100644
index 000000000..95d8f91f7
--- /dev/null
+++ b/crates/pq_das/src/main.rs
@@ -0,0 +1,177 @@
+use std::time::Instant;
+
+use backend::{PrimeCharacteristicRing, PrimeField32};
+use clap::{Parser, ValueEnum};
+use lean_vm::F;
+use pq_das::{
+    DIGEST_LEN, ParameterProfile, SAMPLING_SOUNDNESS_BITS, commitment_size_bytes, demo_data, encode_and_commit,
+    prepare_statement, prove_codewords, query, reconstruct, sample_query_indices, transcript_size_bytes,
+    verify_execution_proof, verify_openings,
+};
+
+#[derive(Clone, Copy, Debug, ValueEnum)]
+enum ProfileName {
+    Tiny,
+    Medium,
+    Large,
+    Stress,
+    #[value(name = "blob-128k-1")]
+    Blob128K1,
+    #[value(name = "blob-128k-4")]
+    Blob128K4,
+    #[value(name = "blob-128k-16")]
+    Blob128K16,
+    Custom,
+}
+
+#[derive(Debug, Parser)]
+#[command(about = "Benchmark the parameterized PQ-DAS LeanVM construction")]
+struct Cli {
+    #[arg(long, value_enum, default_value_t = ProfileName::Tiny)]
+    profile: ProfileName,
+
+    #[arg(long)]
+    n: Option<usize>,
+
+    #[arg(long)]
+    m: Option<usize>,
+
+    #[arg(long)]
+    k: Option<usize>,
+
+    #[arg(long)]
+    c: Option<usize>,
+
+    #[arg(long, default_value_t = 1)]
+    whir_log_inv_rate: usize,
+
+    #[arg(long)]
+    skip_reconstruction: bool,
+}
+
+impl Cli {
+    /// Resolves a named or custom CLI selection into a validated parameter profile.
+    fn selected_profile(&self) -> Result<ParameterProfile, Box<dyn std::error::Error>> {
+        let mut profile = match self.profile {
+            ProfileName::Tiny => ParameterProfile::TINY,
+            ProfileName::Medium => ParameterProfile::MEDIUM,
+            ProfileName::Large => ParameterProfile::LARGE,
+            ProfileName::Stress => ParameterProfile::STRESS,
+            ProfileName::Blob128K1 => ParameterProfile::BLOB_128K_1,
+            ProfileName::Blob128K4 => ParameterProfile::BLOB_128K_4,
+            ProfileName::Blob128K16 => ParameterProfile::BLOB_128K_16,
+            ProfileName::Custom => ParameterProfile::custom(
+                self.n.ok_or("custom profile requires --n")?,
+                self.m.ok_or("custom profile requires --m")?,
+                self.k.ok_or("custom profile requires --k")?,
+                self.c.ok_or("custom profile requires --c")?,
+                self.whir_log_inv_rate,
+            )?,
+        };
+        profile.whir_log_inv_rate = self.whir_log_inv_rate;
+        profile.validate()?;
+        Ok(profile)
+    }
+}
+
+/// Runs one complete benchmark for the selected generalized PQ-DAS profile.
+fn main() -> Result<(), Box<dyn std::error::Error>> {
+    backend::parallel::init();
+    let cli = Cli::parse();
+    let profile = cli.selected_profile()?;
+    let data = demo_data(profile);
+
+    let started = Instant::now();
+    let (commitment, aux) = encode_and_commit(profile, &data)?;
+    let commitment_time = started.elapsed();
+
+    let started = Instant::now();
+    let prepared = prepare_statement(commitment)?;
+    let preprocessing_time = started.elapsed();
+
+    let started = Instant::now();
+    let proof = prove_codewords(&prepared, &aux.codewords)?;
+    let proving_time = started.elapsed();
+
+    let sample_count = profile.sampling_count(SAMPLING_SOUNDNESS_BITS);
+    let indices = sample_query_indices(&prepared.commitment, &[F::from_u32(42); DIGEST_LEN], sample_count)?;
+    let transcript = query(&aux, &indices)?;
+
+    let started = Instant::now();
+    verify_execution_proof(&prepared.commitment, &proof)?;
+    let proof_verification_time = started.elapsed();
+
+    let started = Instant::now();
+    let openings_accepted = verify_openings(&prepared.commitment, &transcript);
+    let opening_verification_time = started.elapsed();
+    let accepted = openings_accepted;
+    let (reconstruction, reconstruction_time) = if cli.skip_reconstruction {
+        (None, None)
+    } else {
+        let reconstruction_indices = sample_query_indices(
+            &prepared.commitment,
+            &[F::from_u32(84); DIGEST_LEN],
+            profile.reconstruction_threshold_cells(),
+        )?;
+        let reconstruction_transcript = query(&aux, &reconstruction_indices)?;
+        let started = Instant::now();
+        let correct = reconstruct(&prepared.commitment, &[reconstruction_transcript])? == data;
+        (Some(correct), Some(started.elapsed()))
+    };
+    let commitment_bytes = commitment_size_bytes(&prepared.commitment);
+    let proof_field_elements = proof.execution.proof.proof_size_fe();
+    let proof_bytes = proof_field_elements * size_of::<u32>();
+    let sample_bytes = transcript_size_bytes(&transcript);
+
+    println!("PQ-DAS LeanVM demo");
+    println!(
+        "profile: {} (n={}, m={}, k={}, rho={}/{}, c={}, cells={}, threshold={})",
+        profile.name,
+        profile.n,
+        profile.m,
+        profile.k,
+        profile.k,
+        profile.m,
+        profile.c,
+        profile.n_cells(),
+        profile.reconstruction_threshold_cells(),
+    );
+    println!(
+        "root: {:?}",
+        prepared.commitment.root.map(|value| value.as_canonical_u32())
+    );
+    println!("proof accepted: {accepted}");
+    println!("bytecode instructions: {}", prepared.bytecode.size());
+    println!(
+        "read-only public elements: {}",
+        prepared.bytecode.read_only_data().len()
+    );
+    println!("sampling soundness target: {SAMPLING_SOUNDNESS_BITS} bits");
+    println!(
+        "sampling log2 failure bound: {:.3}",
+        profile.sampling_log2_failure(sample_count)
+    );
+    println!("opened cells: {sample_count}");
+    println!("commitment size: {commitment_bytes} bytes");
+    println!("proof size: {proof_field_elements} field elements ({proof_bytes} bytes)");
+    println!("sample size: {sample_bytes} bytes");
+    match reconstruction {
+        Some(correct) => println!("reconstruction correct: {correct}"),
+        None => println!("reconstruction: skipped"),
+    }
+    if let Some(elapsed) = reconstruction_time {
+        println!("reconstruction time: {:.3}s", elapsed.as_secs_f64());
+    }
+    println!("encoding + commitment time: {:.3}s", commitment_time.as_secs_f64());
+    println!("prover preprocessing time: {:.3}s", preprocessing_time.as_secs_f64());
+    println!("LeanVM proving time: {:.3}s", proving_time.as_secs_f64());
+    println!(
+        "verifier statement rebuild + LeanVM proof verification time: {:.3}s",
+        proof_verification_time.as_secs_f64()
+    );
+    println!(
+        "opening verification time: {:.3}s",
+        opening_verification_time.as_secs_f64()
+    );
+    Ok(())
+}
diff --git a/crates/pq_das/src/membership.rs b/crates/pq_das/src/membership.rs
new file mode 100644
index 000000000..872f1a5c9
--- /dev/null
+++ b/crates/pq_das/src/membership.rs
@@ -0,0 +1,74 @@
+use backend::{BasedVectorSpace, Field, PrimeCharacteristicRing, TwoAdicField};
+use lean_vm::{EF, F};
+
+use crate::{Commitment, config::EXT_DEGREE, hashing::Digest};
+
+pub type CheckVector = Vec<[F; EXT_DEGREE]>;
+
+/// Interprets the first five digest coordinates as one quintic-extension element.
+fn ext_from_digest(digest: &Digest) -> EF {
+    EF::from_basis_coefficients_slice(&digest[..EXT_DEGREE]).unwrap()
+}
+
+/// Serializes one quintic-extension element into its five KoalaBear coordinates.
+fn coeffs(value: EF) -> [F; EXT_DEGREE] {
+    value.as_basis_coefficients_slice().try_into().unwrap()
+}
+
+/// Derives the extension-field special-barycentric challenge from public data.
+pub fn challenge(commitment: &Commitment) -> EF {
+    ext_from_digest(&crate::hashing::fiat_shamir_digest(commitment))
+}
+
+/// Computes the public half-rate special-barycentric check vector.
+pub fn check_vector(commitment: &Commitment) -> Option<CheckVector> {
+    let profile = commitment.profile;
+    let omega = F::two_adic_generator(profile.m.ilog2() as usize);
+    let omega_sq = omega.square();
+    let p = challenge(commitment);
+    let q = p / EF::from(omega);
+    let h_inv = F::from_usize(profile.k).inverse();
+    let common_p = (p.exp_u64(profile.k as u64) - EF::ONE) * EF::from(h_inv);
+    let common_q = (q.exp_u64(profile.k as u64) - EF::ONE) * EF::from(h_inv);
+
+    let mut xs = Vec::with_capacity(profile.k);
+    let mut denominator_inverses = Vec::with_capacity(profile.m);
+    let mut x = F::ONE;
+    for _ in 0..profile.k {
+        if p == EF::from(x) || q == EF::from(x) {
+            return None;
+        }
+        xs.push(x);
+        denominator_inverses.push(p - EF::from(x));
+        denominator_inverses.push(q - EF::from(x));
+        x *= omega_sq;
+    }
+
+    // Montgomery's trick replaces 2k extension-field inversions with one
+    // inversion and O(k) multiplications.
+    batch_invert(&mut denominator_inverses);
+
+    let mut vector = vec![[F::ZERO; EXT_DEGREE]; profile.m];
+    for (r, x) in xs.into_iter().enumerate() {
+        vector[2 * r] = coeffs(common_p * EF::from(x) * denominator_inverses[2 * r]);
+        vector[2 * r + 1] = coeffs(-(common_q * EF::from(x) * denominator_inverses[2 * r + 1]));
+    }
+    Some(vector)
+}
+
+/// Batch-inverts nonzero extension elements using one actual field inversion.
+fn batch_invert(values: &mut [EF]) {
+    let mut accumulator = EF::ONE;
+    let mut prefixes = Vec::with_capacity(values.len());
+    for &value in values.iter() {
+        prefixes.push(accumulator);
+        accumulator *= value;
+    }
+
+    let mut inverse = accumulator.inverse();
+    for (value, prefix) in values.iter_mut().zip(prefixes).rev() {
+        let original = *value;
+        *value = inverse * prefix;
+        inverse *= original;
+    }
+}
diff --git a/crates/pq_das/src/protocol.rs b/crates/pq_das/src/protocol.rs
new file mode 100644
index 000000000..4b6208f71
--- /dev/null
+++ b/crates/pq_das/src/protocol.rs
@@ -0,0 +1,233 @@
+use std::collections::{BTreeSet, HashMap};
+
+use backend::{PrimeCharacteristicRing, PrimeField32, poseidon_hash_slice, poseidon16_compress_pair};
+use lean_vm::F;
+
+use crate::{
+    Commitment, DIGEST_LEN, DemoError, ParameterProfile, PreparedStatement, ProofBundle,
+    encoding::{Codewords, Data, ErasureDecoder, encode},
+    hashing::{Digest, column_hash, merkle_layers, row_hash},
+    prepare_statement, prove_codewords, verify_execution_proof,
+};
+
+#[derive(Clone, Debug)]
+pub struct AuxiliaryData {
+    pub profile: ParameterProfile,
+    pub codewords: Codewords,
+    pub merkle_layers: Vec<Vec<Digest>>,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct CellOpening {
+    pub index: usize,
+    pub cells: Vec<Vec<F>>,
+    pub authentication_path: Vec<Digest>,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct Transcript {
+    pub openings: Vec<CellOpening>,
+}
+
+/// Encodes data and constructs the public row and column commitments.
+pub fn encode_and_commit(profile: ParameterProfile, data: &Data) -> Result<(Commitment, AuxiliaryData), DemoError> {
+    profile.validate()?;
+    if data.len() != profile.n || data.iter().any(|blob| blob.len() != profile.k) {
+        return Err(DemoError::InvalidDataShape);
+    }
+    let codewords = encode(profile, data);
+    let row_hashes = codewords.iter().map(|row| row_hash(profile, row)).collect();
+    let leaves: Vec<_> = (0..profile.n_cells())
+        .map(|cell| column_hash(profile, &codewords, cell))
+        .collect();
+    let merkle_layers = merkle_layers(&leaves);
+    let commitment = Commitment {
+        profile,
+        row_hashes,
+        root: merkle_layers.last().unwrap()[0],
+    };
+    Ok((
+        commitment,
+        AuxiliaryData {
+            profile,
+            codewords,
+            merkle_layers,
+        },
+    ))
+}
+
+/// Runs the complete convenience workflow while retaining the prepared statement.
+pub fn commit(
+    profile: ParameterProfile,
+    data: &Data,
+) -> Result<(PreparedStatement, AuxiliaryData, ProofBundle), DemoError> {
+    let (commitment, aux) = encode_and_commit(profile, data)?;
+    let prepared = prepare_statement(commitment)?;
+    let proof = prove_codewords(&prepared, &aux.codewords)?;
+    Ok((prepared, aux, proof))
+}
+
+/// Opens requested cell columns and attaches their complete Merkle paths.
+pub fn query(aux: &AuxiliaryData, indices: &[usize]) -> Result<Transcript, DemoError> {
+    let profile = aux.profile;
+    let mut seen = BTreeSet::new();
+    let mut openings = Vec::with_capacity(indices.len());
+    for &index in indices {
+        if index >= profile.n_cells() || !seen.insert(index) {
+            return Err(DemoError::InvalidQuery);
+        }
+        let start = index * profile.c;
+        let cells = aux
+            .codewords
+            .iter()
+            .map(|row| row[start..start + profile.c].to_vec())
+            .collect();
+        let mut node = index;
+        let mut authentication_path = Vec::with_capacity(profile.merkle_depth());
+        for layer in aux.merkle_layers.iter().take(profile.merkle_depth()) {
+            authentication_path.push(layer[node ^ 1]);
+            node /= 2;
+        }
+        openings.push(CellOpening {
+            index,
+            cells,
+            authentication_path,
+        });
+    }
+    Ok(Transcript { openings })
+}
+
+/// Derives the requested number of distinct cell indices from public randomness.
+pub fn sample_query_indices(
+    commitment: &Commitment,
+    randomness: &[F; DIGEST_LEN],
+    count: usize,
+) -> Result<Vec<usize>, DemoError> {
+    let n_cells = commitment.profile.n_cells();
+    if count == 0 || count > n_cells {
+        return Err(DemoError::InvalidQuery);
+    }
+
+    let mut state = poseidon16_compress_pair(&commitment.root, randomness);
+    let mut counter = 0;
+    let mut seen = BTreeSet::new();
+    let mut indices = Vec::with_capacity(count);
+    while indices.len() < count {
+        for value in state {
+            let canonical = value.as_canonical_u32();
+            let modulus = F::ORDER_U32;
+            let unbiased_limit = modulus - modulus % n_cells as u32;
+            if canonical >= unbiased_limit {
+                continue;
+            }
+            let index = canonical as usize % n_cells;
+            if seen.insert(index) {
+                indices.push(index);
+                if indices.len() == count {
+                    break;
+                }
+            }
+        }
+        counter += 1;
+        let mut counter_block = [F::ZERO; DIGEST_LEN];
+        counter_block[0] = F::from_usize(counter);
+        state = poseidon16_compress_pair(&state, &counter_block);
+    }
+    Ok(indices)
+}
+
+/// Returns the canonical byte size of the public row hashes and Merkle root.
+pub fn commitment_size_bytes(commitment: &Commitment) -> usize {
+    (commitment.row_hashes.len() + 1) * DIGEST_LEN * size_of::<u32>()
+}
+
+/// Returns the canonical byte size of queried indices, cells, and Merkle paths.
+pub fn transcript_size_bytes(transcript: &Transcript) -> usize {
+    transcript
+        .openings
+        .iter()
+        .map(|opening| {
+            size_of::<u32>()
+                + opening.cells.iter().map(Vec::len).sum::<usize>() * size_of::<u32>()
+                + opening.authentication_path.len() * DIGEST_LEN * size_of::<u32>()
+        })
+        .sum()
+}
+
+/// Rehashes each opened column and verifies its full path against the public root.
+pub fn verify_openings(commitment: &Commitment, transcript: &Transcript) -> bool {
+    let profile = commitment.profile;
+    let mut seen = BTreeSet::new();
+    transcript.openings.iter().all(|opening| {
+        if opening.index >= profile.n_cells()
+            || !seen.insert(opening.index)
+            || opening.cells.len() != profile.n
+            || opening.cells.iter().any(|cell| cell.len() != profile.c)
+            || opening.authentication_path.len() != profile.merkle_depth()
+        {
+            return false;
+        }
+        let values: Vec<F> = opening.cells.iter().flatten().copied().collect();
+        let mut digest = poseidon_hash_slice(&values);
+        let mut node = opening.index;
+        for sibling in &opening.authentication_path {
+            digest = if node.is_multiple_of(2) {
+                poseidon16_compress_pair(&digest, sibling)
+            } else {
+                poseidon16_compress_pair(sibling, &digest)
+            };
+            node /= 2;
+        }
+        digest == commitment.root
+    })
+}
+
+/// Rebuilds the public LeanVM statement, verifies its proof, and checks openings.
+pub fn verify(commitment: &Commitment, proof: &ProofBundle, transcript: &Transcript) -> Result<bool, DemoError> {
+    verify_execution_proof(commitment, proof)?;
+    Ok(verify_openings(commitment, transcript))
+}
+
+/// Collects distinct valid cells and reconstructs every original systematic blob.
+pub fn reconstruct(commitment: &Commitment, transcripts: &[Transcript]) -> Result<Data, DemoError> {
+    let profile = commitment.profile;
+    let mut openings = HashMap::new();
+    for transcript in transcripts {
+        if !verify_openings(commitment, transcript) {
+            return Err(DemoError::InvalidOpening);
+        }
+        for opening in &transcript.openings {
+            openings.entry(opening.index).or_insert_with(|| opening.clone());
+        }
+    }
+    if openings.len() < profile.reconstruction_threshold_cells() {
+        return Err(DemoError::InsufficientCells);
+    }
+
+    let mut indices: Vec<_> = openings.keys().copied().collect();
+    indices.sort_unstable();
+    let symbol_indices: Vec<_> = indices
+        .iter()
+        .flat_map(|&index| (0..profile.c).map(move |offset| index * profile.c + offset))
+        .collect();
+    let decoder = ErasureDecoder::new(profile, &symbol_indices).ok_or(DemoError::ReconstructionFailed)?;
+    (0..profile.n)
+        .map(|row| {
+            let values: Vec<_> = indices
+                .iter()
+                .flat_map(|&index| {
+                    let opening = &openings[&index];
+                    (0..profile.c).map(move |offset| opening.cells[row][offset])
+                })
+                .collect();
+            decoder.reconstruct_blob(&values).ok_or(DemoError::ReconstructionFailed)
+        })
+        .collect()
+}
+
+/// Reports how many systematic positions can occur inside one cell.
+pub fn systematic_symbols_per_cell(profile: ParameterProfile) -> usize {
+    (0..profile.c)
+        .filter(|offset| offset.is_multiple_of(profile.systematic_stride()))
+        .count()
+}
diff --git a/crates/pq_das/zkdsl/main.py b/crates/pq_das/zkdsl/main.py
new file mode 100644
index 000000000..7b575cb43
--- /dev/null
+++ b/crates/pq_das/zkdsl/main.py
@@ -0,0 +1,102 @@
+from snark_lib import *
+
+DIM = 5
+DIGEST_LEN = 8
+N = N_PLACEHOLDER
+M = M_PLACEHOLDER
+K = K_PLACEHOLDER
+C = C_PLACEHOLDER
+N_CELLS = N_CELLS_PLACEHOLDER
+SYSTEMATIC_STRIDE = SYSTEMATIC_STRIDE_PLACEHOLDER
+ROW_CHUNKS = ROW_CHUNKS_PLACEHOLDER
+COLUMN_CHUNKS = COLUMN_CHUNKS_PLACEHOLDER
+MERKLE_DEPTH = MERKLE_DEPTH_PLACEHOLDER
+TREE_DIGESTS = TREE_DIGESTS_PLACEHOLDER
+LEVEL_SIZES = LEVEL_SIZES_PLACEHOLDER
+LEVEL_OFFSETS = LEVEL_OFFSETS_PLACEHOLDER
+
+PUBLIC_ROW_HASHES_PTR = PUBLIC_ROW_HASHES_PTR_PLACEHOLDER
+PUBLIC_ROOT_PTR = PUBLIC_ROOT_PTR_PLACEHOLDER
+CHECK_VECTOR_PTR = CHECK_VECTOR_PTR_PLACEHOLDER
+
+
+@inline
+# Constrains one extension element to be zero in all five coordinates.
+def assert_ext_zero(a):
+    for i in unroll(0, DIM):
+        assert a[i] == 0
+    return
+
+
+# Poseidon-hashes complete rate-eight blocks with a compact runtime loop.
+def hash_chunks(data, num_chunks: Const):
+    iv = Array(DIGEST_LEN)
+    iv[0] = num_chunks * DIGEST_LEN
+    for i in unroll(1, DIGEST_LEN):
+        iv[i] = 0
+
+    full = Array(2 * DIGEST_LEN)
+    if num_chunks == 1:
+        poseidon16_permute(iv, data, full)
+    else:
+        states = Array((num_chunks - 1) * DIGEST_LEN)
+        poseidon16_permute_half(iv, data, states)
+        for chunk in range(1, num_chunks - 1):
+            poseidon16_permute_half(
+                states + (chunk - 1) * DIGEST_LEN,
+                data + chunk * DIGEST_LEN,
+                states + chunk * DIGEST_LEN,
+            )
+        poseidon16_permute(
+            states + (num_chunks - 2) * DIGEST_LEN,
+            data + (num_chunks - 1) * DIGEST_LEN,
+            full,
+        )
+    return full + DIGEST_LEN
+
+
+# Checks only the construction's row hashes, column Merkle root, and RS identities.
+def main():
+    codewords = Array(N * M)
+    hint_witness("codewords", codewords)
+    public_row_hashes = PUBLIC_ROW_HASHES_PTR
+    public_root = PUBLIC_ROOT_PTR
+    check_vector = CHECK_VECTOR_PTR
+
+    for row in range(0, N):
+        systematic = Array(K)
+        for i in range(0, K):
+            systematic[i] = codewords[row * M + i * SYSTEMATIC_STRIDE]
+        digest = hash_chunks(systematic, ROW_CHUNKS)
+        for i in unroll(0, DIGEST_LEN):
+            assert digest[i] == public_row_hashes[row * DIGEST_LEN + i]
+
+    tree = Array(TREE_DIGESTS * DIGEST_LEN)
+    for cell in range(0, N_CELLS):
+        column_data = Array(N * C)
+        for row in range(0, N):
+            for offset in range(0, C):
+                column_data[row * C + offset] = codewords[row * M + cell * C + offset]
+        digest = hash_chunks(column_data, COLUMN_CHUNKS)
+        for i in unroll(0, DIGEST_LEN):
+            tree[cell * DIGEST_LEN + i] = digest[i]
+
+    for level in unroll(0, MERKLE_DEPTH):
+        input_offset = LEVEL_OFFSETS[level]
+        output_offset = LEVEL_OFFSETS[level + 1]
+        for node in range(0, LEVEL_SIZES[level + 1]):
+            poseidon16_compress_half(
+                tree + (input_offset + 2 * node) * DIGEST_LEN,
+                tree + (input_offset + 2 * node + 1) * DIGEST_LEN,
+                tree + (output_offset + node) * DIGEST_LEN,
+            )
+
+    root_offset = LEVEL_OFFSETS[MERKLE_DEPTH] * DIGEST_LEN
+    for i in unroll(0, DIGEST_LEN):
+        assert tree[root_offset + i] == public_root[i]
+
+    for row in range(0, N):
+        result = Array(DIM)
+        dot_product_be(codewords + row * M, check_vector, result, M)
+        assert_ext_zero(result)
+    return