
Add _build_sparse_constraint_system for O(nnz) calibration build#7

Merged
MaxGhenis merged 1 commit into main from sparse-constraint-builder on Apr 19, 2026

Conversation

@MaxGhenis
Contributor

Summary

Avoid dense materialization when building the calibration linear system. At microplex-us v7 scale (~1.5M records × ~4k constraints, mostly marginal indicators that are 95%+ zero), the existing _build_linear_constraint_system allocates a dense numpy array of ~24 GB, of which only ~100-500 MB is non-zero. Downstream L0 calibrators immediately convert to CSR anyway, so the dense intermediate is pure waste.

Downstream evidence: on 2026-04-18 at 23:40:50, macOS memorystatus killed the microplex-us pipeline mid-calibration, logging python3.14 [28015] 172343 MB (compressed). Root cause was this dense build path, not microcalibrate's own internals.

What changes

New sibling of _build_linear_constraint_system in microplex.calibration:

  • _build_sparse_constraint_system(data, marginal_targets, continuous_targets, linear_constraints) -> (X_csr, b, names, n_categorical)

The builder constructs the matrix row-by-row via (indptr, indices, data) construction, as sketched below. For each marginal category, it stores only the np.flatnonzero(column == category) entries, each with value 1.0. Continuous columns and LinearConstraint coefficients are likewise stored only at their nonzero positions.
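A minimal sketch of that construction, assuming data is a pandas DataFrame, marginal_targets maps column → {category: total}, continuous_targets maps column → total, and LinearConstraint objects expose coefficients, target, and name; the actual microplex.calibration internals may differ:

```python
# Sketch only: input shapes and the names format are assumptions made
# for illustration, not the exact microplex.calibration implementation.
import numpy as np
import scipy.sparse as sp


def build_sparse_constraint_system(data, marginal_targets,
                                   continuous_targets, linear_constraints):
    n_records = len(data)
    indptr, indices, values = [0], [], []
    b, names = [], []

    def push_row(nz_idx, nz_val):
        # Append one CSR row: column indices and values of its nonzeros.
        indices.extend(nz_idx.tolist())
        values.extend(np.asarray(nz_val, dtype=np.float64).tolist())
        indptr.append(len(indices))

    # Marginal indicators: one row per category, value 1.0 wherever the
    # record's column equals the category.
    for column, targets in marginal_targets.items():
        col = data[column].to_numpy()
        for category, total in targets.items():
            nz = np.flatnonzero(col == category)
            push_row(nz, np.ones(nz.size))
            b.append(total)
            names.append(f"{column}=={category}")  # assumed name format
    n_categorical = len(names)

    # Continuous targets: keep only the nonzero column values.
    for column, total in continuous_targets.items():
        col = data[column].to_numpy(dtype=np.float64)
        nz = np.flatnonzero(col)
        push_row(nz, col[nz])
        b.append(total)
        names.append(column)

    # LinearConstraint rows: same idea on the coefficient vector
    # (coefficients/target/name attributes are assumed here).
    for lc in linear_constraints:
        coef = np.asarray(lc.coefficients, dtype=np.float64)
        nz = np.flatnonzero(coef)
        push_row(nz, coef[nz])
        b.append(lc.target)
        names.append(lc.name)

    X_csr = sp.csr_matrix(
        (np.asarray(values), np.asarray(indices), np.asarray(indptr)),
        shape=(len(b), n_records),
    )
    return X_csr, np.asarray(b, dtype=np.float64), names, n_categorical
```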

Equivalence guarantee

_build_sparse_constraint_system(...).toarray() == _build_linear_constraint_system(...)[0] up to float64 rounding. Tests in tests/test_sparse_constraint_system.py pin this equivalence on three fixtures (marginal-only; continuous + linear + marginal; high-cardinality marginal for the density check).
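A sketch of the equivalence check, with fixture construction elided; the actual tests in the repo may be structured differently:

```python
# Both builders return (matrix, b, names, n_categorical); the sparse
# matrix, densified, must match the dense A up to float64 rounding.
import numpy as np
from microplex.calibration import (
    _build_linear_constraint_system,
    _build_sparse_constraint_system,
)

def check_equivalence(data, marginal_targets, continuous_targets,
                      linear_constraints):
    X, b_s, names_s, ncat_s = _build_sparse_constraint_system(
        data, marginal_targets, continuous_targets, linear_constraints)
    A, b_d, names_d, ncat_d = _build_linear_constraint_system(
        data, marginal_targets, continuous_targets, linear_constraints)
    np.testing.assert_allclose(X.toarray(), A)  # float64-level agreement
    np.testing.assert_allclose(b_s, b_d)
    assert names_s == names_d and ncat_s == ncat_d
```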

Test plan

  • pytest tests/test_sparse_constraint_system.py — 4 new tests pass.
  • pytest tests/test_calibration.py tests/test_microcalibrate_adapter.py — 36 existing tests unchanged.
  • Downstream: microplex-us's PolicyEngineL0Calibrator swapped to call the sparse builder directly; 2 existing regression tests still pass (test_pe_l0.py).

Why sparse-native and not sparse-coercion

sp.csr_matrix(dense_array) would still require the dense array to exist first. The point is to avoid allocating 24 GB in the first place; building directly into CSR storage does that.
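For intuition, a back-of-envelope comparison of peak allocation on the two paths; the counts and density are placeholders, not the exact v7 figures:

```python
# Placeholder arithmetic, not the exact v7 numbers; the point is the ratio.
def dense_peak_bytes(n_targets, n_records):
    # np.vstack(rows) holds every entry, zero or not, as float64.
    return n_targets * n_records * 8

def csr_peak_bytes(n_targets, n_records, density):
    # CSR keeps one float64 value and one int32 column index per
    # nonzero, plus one int64 indptr offset per row.
    nnz = int(density * n_targets * n_records)
    return nnz * (8 + 4) + (n_targets + 1) * 8
```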

🤖 Generated with Claude Code

The existing _build_linear_constraint_system materializes a dense
numpy array of shape (n_targets, n_records) via np.vstack(rows). At
microplex-us v7 scale (~1.5M records × ~4k constraints, mostly
marginal indicators that are 95%+ zero), this is ~24 GB of dense
float64 allocated just to represent ~100-500 MB of nonzero data.

The pipeline's L0 calibrator (microplex_us.pipelines.pe_l0) immediately
wraps the dense matrix in sp.csr_matrix(A) — it already wants sparse;
we just got there through a dense intermediate that wastes the memory.
That intermediate is what macOS memorystatus killed as the "172 GB
compressed process" in the v7 rerun on 2026-04-18.

This commit adds a sparse-native builder that produces a CSR matrix
row-by-row via (indptr, indices, data) construction, never allocating
the full dense intermediate:

- Marginal targets: each category produces a CSR row by
  np.flatnonzero(column == category), storing only the matching row
  indices with value 1.0.
- Continuous targets: flatnonzero on the column values, storing only
  nonzero entries.
- LinearConstraint rows: flatnonzero on coefficients, same idea.

Semantics match _build_linear_constraint_system exactly: both return
(matrix, b, names, n_categorical); the sparse version returns a CSR
whose .toarray() equals the dense version's A (up to float64
rounding).

Tests in tests/test_sparse_constraint_system.py pin:
1. _build_sparse_constraint_system importable from microplex.calibration.
2. Sparse output == dense output for marginal-only problem.
3. Sparse output == dense output for mixed marginal + continuous +
   LinearConstraint problem.
4. Actual sparsity: density < 0.45 on a realistic 4-state × 3-age
   marginal problem (the point of the refactor; sketched below).
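An illustrative version of that density assertion, using a random stand-in fixture rather than the exact one in the test file (the dict-of-dicts marginal_targets layout is also an assumption):

```python
import numpy as np
import pandas as pd
from microplex.calibration import _build_sparse_constraint_system

rng = np.random.default_rng(0)
data = pd.DataFrame({
    "state": rng.choice(["CA", "TX", "NY", "FL"], size=10_000),
    "age_band": rng.choice(["0-17", "18-64", "65+"], size=10_000),
})
marginal_targets = {  # dict-of-dicts layout is an assumption
    "state": {s: 2_500.0 for s in ("CA", "TX", "NY", "FL")},
    "age_band": {a: 3_400.0 for a in ("0-17", "18-64", "65+")},
}
X, b, names, n_cat = _build_sparse_constraint_system(
    data, marginal_targets, {}, [])
# Each record contributes exactly one nonzero per marginal, so density
# here is 2 / 7 — well under the 0.45 bound.
density = X.nnz / (X.shape[0] * X.shape[1])
assert density < 0.45
```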

36 existing calibration tests unchanged; 40 total now pass.

Downstream wiring: microplex_us.pipelines.pe_l0.PolicyEngineL0Calibrator
calls this directly in a companion commit, bypassing the dense path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MaxGhenis merged commit 5c358e3 into main on Apr 19, 2026
3 checks passed
MaxGhenis deleted the sparse-constraint-builder branch on April 19, 2026 at 12:12
MaxGhenis added a commit to CosilicoAI/microplex-us that referenced this pull request Apr 22, 2026
Two linked changes:

1. pe_l0.py: PolicyEngineL0Calibrator.fit now calls
   _build_sparse_constraint_system from microplex.calibration directly,
   skipping the dense np.vstack + sp.csr_matrix(A) round-trip. At v7
   scale (1.5M records × ~4k constraints) this avoids the ~24 GB dense
   intermediate over which macOS memorystatus killed the v7
   microcalibrate rerun on 2026-04-18 (python3.14 [28015] grew to 172 GB
   compressed). Requires microplex from the sparse-constraint-builder
   branch (CosilicoAI/microplex#7). Residual computation also switched
   from `A @ weights - b` to `X_sparse @ weights - b`; identical
   numerics, no dense matrix ever materialized.

2. paper/index.qmd §3.3 / §3.4: weaken the identity-preservation
   definition from strict positivity (∀i: w_i' > 0) to row-set
   preservation (∀i: w_i' >= 0 AND id(r_i') = id(r_i)). Max's point
   in conversation: a record with w_i = 0 still has its entity
   identifier and row position in the HDF5 dataset — it's just
   excluded from the current year's weighted aggregates, and is
   available for year Y+1's calibration to re-weight up. This is
   consistent with CBOLT / DYNASIM's equal-per-person frozen-weight
   convention; allowing zero weights is a strict superset of that
   flexibility (see the predicate sketch after this list).

   §3.4 (Sparse L0) rewritten accordingly: L0 is now framed as a
   first-class calibrator alongside chi-squared, not as "optional
   post-processing." Both backends are identity-preserving under the
   corrected definition. The chi-squared vs L0 trade-off is now
   "deployment artifact size vs rare-subpopulation coverage audit
   burden" rather than "identity vs size."

Consequence for v8: the pe_l0 backend is now recommended for
memory-constrained runs on the 48 GB workstation. Next launch should
use --calibration-backend pe_l0 alongside --donor-imputer-backend zi_qrf
(see docs/next-run-plan.md).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>