Migrate to policyengine v4 and resolve runtime bundles #3487

Open
MaxGhenis wants to merge 9 commits into master from migrate-to-policyengine-v4

Conversation

MaxGhenis (Collaborator) commented Apr 21, 2026

Summary

  • Bumps the API toward the policyengine v4 line so the API can import the TRACE provenance helpers.
  • Replaces pre-v4 imports with API-local SimulationOptions and dataset helpers.
  • Adds policyengine_api.libs.runtime_bundle.resolve_runtime_bundle, which resolves the model version, data version, canonical HF dataset URI, worker dataset URI, dataset sha, package versions, and bundle fingerprint for a simulation request.
  • Preserves data_version when sending payloads to Modal instead of dropping it.
  • Resolves the Modal simulation-worker app name before creating new economy jobs, then includes that resolved worker identity in the options hash/cache lookup.
  • Adds the resolved policyengine_bundle to economy-service metadata, _runtime_bundle, options hashing, and trace-candidate telemetry.
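As a rough illustration of the bundle fingerprint and the worker-aware options hash described above, here is a minimal sketch. It is not the code in policyengine_api.libs.runtime_bundle: the class name RuntimeBundle, its field names, the options_hash helper, the placeholder URIs, and the choice of SHA-256 over canonical JSON are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field, replace


@dataclass
class RuntimeBundle:
    """Hypothetical container for the fields the summary lists."""

    model_version: str
    data_version: str
    canonical_dataset_uri: str
    worker_dataset_uri: str
    dataset_sha: str
    package_versions: dict = field(default_factory=dict)

    def fingerprint(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so that equal
        # bundles always serialize, and therefore hash, identically.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def options_hash(options: dict, bundle: RuntimeBundle, worker_app_name: str) -> str:
    """Hash the request options together with the resolved runtime identity."""
    keyed = {
        "options": options,
        "bundle_fingerprint": bundle.fingerprint(),
        "worker_app_name": worker_app_name,
    }
    payload = json.dumps(keyed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Versions are taken from the smoke test in this PR; URIs and sha are placeholders.
bundle = RuntimeBundle(
    model_version="1.653.3",
    data_version="1.87.0",
    canonical_dataset_uri="hf://placeholder/dataset.h5",
    worker_dataset_uri="gs://placeholder/dataset.h5",
    dataset_sha="0" * 64,
    package_versions={"policyengine": "4.3.1"},
)
older_data = replace(bundle, data_version="1.86.0")
```

Under this scheme a change to data_version or to the worker app name produces a different hash, so a cached result computed by one worker or dataset is not returned for a request that resolves to another.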

Modal worker status

Verification

  • FLASK_DEBUG=1 uv run ruff check policyengine_api/libs/runtime_bundle.py policyengine_api/libs/simulation_types.py policyengine_api/libs/simulation_api_modal.py policyengine_api/services/economy_service.py tests/unit/libs/test_runtime_bundle.py tests/unit/libs/test_simulation_types.py tests/unit/libs/test_simulation_api_modal.py tests/unit/services/test_economy_service.py tests/fixtures/services/economy_service.py
  • FLASK_DEBUG=1 uv run pytest tests/unit/libs/test_runtime_bundle.py tests/unit/libs/test_simulation_types.py tests/unit/libs/test_simulation_api_modal.py tests/unit/services/test_economy_service.py -q (130 passed, 2 warnings)
  • GitHub Actions: Lint, changelog, Docker, env tests, full Test, Codecov project, and Codecov patch are green.
  • policyengine-api-v2 #469 ("Update PolicyEngine US to 0.329.1") local worker tests: 173 passed, 1 deselected, 1 warning.
  • policyengine-api-v2 main deploy run 24959697217: staging and production deploy/integration jobs succeeded.
  • Production Modal smoke against version=1.653.3, data_version=1.87.0, and _runtime_bundle: job fc-01KQ566MSKNJRJ2C5SCGNNGQZM completed with budget, poverty, and inequality results.

Follow-up

Fixes #3486. Unblocks #3485.

MaxGhenis and others added 2 commits April 21, 2026 15:11
Prerequisite for #3485 (webapp TRACE TRO
emission). The v4 provenance primitives live in
policyengine.provenance.trace.* which only ship on the 4.x line; the
api was pinned to 0.x and could not use them.

Changes:
- pyproject.toml: bump policyengine >0.12.0,<1 -> >=4.3.1,<5;
  policyengine_us 1.634.9 -> 1.653.3; policyengine_uk 2.78.0 -> 2.88.0.
- Remove two imports that relied on the pre-v4 orchestrator:
    from policyengine.simulation import SimulationOptions
    from policyengine.utils.data.datasets import get_default_dataset
- Replace with api-local equivalents in a new
  policyengine_api.libs.simulation_types module:
    * SimulationOptions: Pydantic model matching the Modal simulation
      worker's existing JSON wire contract. Owning the type api-side
      decouples the api from pe.py's internal class layout.
    * get_default_dataset(country_id, region): api-local helper
      returning the default HF-hosted dataset URI for (country, region).
      State / district regions fall back to the national default,
      matching prior behavior.
- 9 smoke tests for the new module. Tests pass in isolation; the
  api's full pytest suite requires CloudSQL credentials to import,
  so end-to-end integration verification lands in CI.
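
A minimal sketch of what an api-owned wire type can look like. The
module in this PR uses a Pydantic model; this example swaps in a stdlib
dataclass so it runs without extra dependencies. The class name and
every field name below are assumptions for illustration, not the
worker's actual JSON contract.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SimulationOptionsSketch:
    """Hypothetical stand-in for the api-local SimulationOptions type."""

    country: str
    scope: str
    time_period: str
    region: str
    data: str
    reform: dict = field(default_factory=dict)
    baseline: dict = field(default_factory=dict)
    model_version: Optional[str] = None
    data_version: Optional[str] = None

    def to_wire(self) -> str:
        # Serialize every field, so data_version reaches the worker
        # instead of being dropped on the way out.
        return json.dumps(asdict(self), sort_keys=True)


# Versions come from this PR; the remaining values are placeholders.
options = SimulationOptionsSketch(
    country="us",
    scope="macro",
    time_period="2026",
    region="us",
    data="gs://placeholder/dataset.h5",
    model_version="1.653.3",
    data_version="1.87.0",
)
wire = options.to_wire()
```

Because the api owns the type, a change to the library's internal
class layout does not change the payload the worker receives.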

Draft status: this PR ships the import-level migration and the type
contract. It does NOT exercise the full v4 code paths in production,
because the Modal simulation-worker side also pins pe.py — whoever
merges this should coordinate with Modal-side version alignment.
After that, the follow-up to #3485 becomes tractable: the api can
call policyengine.provenance.trace.build_trace_tro_from_release_bundle
to emit institutionally-signed TROs for every simulation run.

Related:
- #3485 (webapp TRO emission — unblocked by this)
- #3486 (this migration — issue scoped)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Codex review caught two real bugs in the first pass of this PR:

1. The UK default URI pointed at the wrong dataset (enhanced_frs_2022_23
   in the public bucket), while both the pre-v4 contract and the
   canonical policyengine.py release manifest expect
   enhanced_frs_2023_24 in the -private bucket (UKDS-licensed data).
   Shipping as-was would have broken every UK webapp run.

2. US state / congressional-district / place regions were collapsed to
   the national default, contradicting the existing test contract in
   tests/unit/services/test_economy_service.py:1353,1371 which requires
   per-state (states/CA.h5), per-district (districts/CA-37.h5), and
   place-to-parent-state (place/NJ-57000 -> states/NJ.h5) routing. This
   would have broken per-state and per-district economy-wide runs.

Both classes of bug surface as provenance drift: the api metadata
would show one dataset, the worker would run against another, and
any TRO emitted on top of that would still look authoritative.

The rewritten get_default_dataset faithfully ports the pre-v4 GCS-URI
contract (gs://policyengine-{country}-data{-private}/...) to the
api-local module. State codes and district IDs are upper-cased;
place regions parse the NJ-57000 shape and reuse the parent state.
Unknown regions raise rather than silently falling through.
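
The routing rules above can be sketched as follows. This is a
simplified illustration, not the ported helper: the function name, the
region prefixes (state/, congressional_district/), the .h5 suffix on
the UK file, and the US national filename are assumptions. The bucket
pattern, the UK dataset name, and the states/, districts/, and place
examples come from this commit message.

```python
US_BUCKET = "gs://policyengine-us-data"
UK_BUCKET = "gs://policyengine-uk-data-private"
US_NATIONAL_FILE = "enhanced_cps_2024.h5"  # assumed filename, for illustration only
UK_NATIONAL_FILE = "enhanced_frs_2023_24.h5"  # dataset name from this commit; suffix assumed


def default_dataset_sketch(country_id: str, region: str) -> str:
    """Return a default dataset URI, raising on anything unrecognized."""
    if country_id == "uk":
        if region == "uk":
            return f"{UK_BUCKET}/{UK_NATIONAL_FILE}"
        raise ValueError(f"Unknown UK region: {region!r}")
    if country_id != "us":
        raise ValueError(f"Unknown country: {country_id!r}")
    if region == "us":
        return f"{US_BUCKET}/{US_NATIONAL_FILE}"

    kind, _, value = region.partition("/")
    if kind == "state" and value:
        # State codes are upper-cased: state/ca -> states/CA.h5
        return f"{US_BUCKET}/states/{value.upper()}.h5"
    if kind == "congressional_district" and value:
        # District IDs are upper-cased: ca-37 -> districts/CA-37.h5
        return f"{US_BUCKET}/districts/{value.upper()}.h5"
    if kind == "place" and "-" in value:
        # Places reuse the parent state: place/NJ-57000 -> states/NJ.h5
        state_code = value.split("-", 1)[0]
        return f"{US_BUCKET}/states/{state_code.upper()}.h5"
    # No silent fallback to the national default.
    raise ValueError(f"Unknown US region: {region!r}")
```

Raising on unknown regions keeps a typo in the region from quietly
running against the national dataset while the metadata reports
something else.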

Tests updated to match the real contract instead of the earlier
naive fallback assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis force-pushed the migrate-to-policyengine-v4 branch from 6da4f35 to 0be9e1a Compare April 21, 2026 20:30
@MaxGhenis MaxGhenis changed the title Migrate from policyengine v0.x to v4 (prerequisite for TRACE TRO emission) Migrate to policyengine v4 and resolve runtime bundles Apr 26, 2026
codecov Bot commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 91.37931% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.57%. Comparing base (8a7c1df) to head (27e4db9).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| policyengine_api/libs/runtime_bundle.py | 88.88% | 7 Missing and 5 partials ⚠️ |
| policyengine_api/services/economy_service.py | 86.36% | 1 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3487      +/-   ##
==========================================
+ Coverage   77.00%   77.57%   +0.57%     
==========================================
  Files          63       65       +2     
  Lines        3418     3581     +163     
  Branches      617      642      +25     
==========================================
+ Hits         2632     2778     +146     
- Misses        612      621       +9     
- Partials      174      182       +8     

Development

Successfully merging this pull request may close these issues.

Migrate policyengine dep from 0.x to v4 (prereq for TRACE TRO emission)

1 participant