Migrate to policyengine v4 and resolve runtime bundles #3487
Open
Prerequisite for #3485 (webapp TRACE TRO emission). The v4 provenance
primitives live in policyengine.provenance.trace.* which only ship on
the 4.x line; the api was pinned to 0.x and could not use them.

Changes:
- pyproject.toml: bump policyengine >0.12.0,<1 -> >=4.3.1,<5;
  policyengine_us 1.634.9 -> 1.653.3; policyengine_uk 2.78.0 -> 2.88.0.
- Remove two imports that relied on the pre-v4 orchestrator:
    from policyengine.simulation import SimulationOptions
    from policyengine.utils.data.datasets import get_default_dataset
- Replace with api-local equivalents in a new
  policyengine_api.libs.simulation_types module:
  * SimulationOptions: Pydantic model matching the Modal simulation
    worker's existing JSON wire contract. Owning the type api-side
    decouples the api from pe.py's internal class layout.
  * get_default_dataset(country_id, region): api-local helper returning
    the default HF-hosted dataset URI for (country, region). State /
    district regions fall back to the national default, matching prior
    behavior.
- 9 smoke tests for the new module.

Tests pass in isolation; the api's full pytest suite requires CloudSQL
credentials to import, so end-to-end integration verification lands in CI.

Draft status: this PR ships the import-level migration and the type
contract. It does NOT exercise the full v4 code paths in production,
because the Modal simulation-worker side also pins pe.py. Whoever merges
this should coordinate with Modal-side version alignment. After that,
the follow-up to #3485 becomes tractable: the api can call
policyengine.provenance.trace.build_trace_tro_from_release_bundle to
emit institutionally-signed TROs for every simulation run.

Related:
- #3485 (webapp TRO emission, unblocked by this)
- #3486 (this migration, issue scoped)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex review caught two real bugs in the first pass of this PR:
1. The UK default URI pointed at the wrong dataset (enhanced_frs_2022_23
in the public bucket), while both the pre-v4 contract and the
canonical policyengine.py release manifest expect
enhanced_frs_2023_24 in the -private bucket (UKDS-licensed data).
Shipping it unchanged would have broken every UK webapp run.
2. US state / congressional-district / place regions were collapsed to
the national default, contradicting the existing test contract in
tests/unit/services/test_economy_service.py:1353,1371 which requires
per-state (states/CA.h5), per-district (districts/CA-37.h5), and
place-to-parent-state (place/NJ-57000 -> states/NJ.h5) routing. This
would have broken per-state and per-district economy-wide runs.
Both classes of bug surface as provenance drift: the api metadata
would show one dataset, the worker would run against another, and
any TRO emitted on top of that would still look authoritative.
The rewritten get_default_dataset faithfully ports the pre-v4 GCS-URI
contract (gs://policyengine-{country}-data{-private}/...) to the
api-local module. State codes and district IDs are upper-cased;
place regions parse the NJ-57000 shape and reuse the parent state.
Unknown regions raise rather than silently falling through.
Tests updated to match the real contract instead of the earlier
naive fallback assertions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
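
The routing rules described in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in policyengine_api.libs.simulation_types: the region key shapes ("state/ca", "congressional_district/CA-37"), the US national file name, and the ".h5" suffix on the UK file are hypothetical. Only the bucket pattern, the states/, districts/ and place-to-parent-state routing, and the UK dataset name come from the text above.

```python
from typing import Optional

US_BUCKET = "gs://policyengine-us-data"
UK_PRIVATE_BUCKET = "gs://policyengine-uk-data-private"

# Hypothetical file names: the source names the UK dataset but not its
# extension, and does not name the US national file at all.
UK_DEFAULT_FILE = "enhanced_frs_2023_24.h5"
US_NATIONAL_FILE = "national_default.h5"


def get_default_dataset(country_id: str, region: Optional[str] = None) -> str:
    """Return a default dataset URI for (country, region), or raise."""
    if country_id == "uk":
        return f"{UK_PRIVATE_BUCKET}/{UK_DEFAULT_FILE}"
    if country_id != "us":
        raise ValueError(f"Unsupported country: {country_id!r}")

    if region is None or region == "us":
        return f"{US_BUCKET}/{US_NATIONAL_FILE}"

    # Assumed region shape: "<kind>/<code>", e.g. "place/NJ-57000".
    kind, _, code = region.partition("/")
    code = code.upper()
    if not code:
        raise ValueError(f"Unknown region: {region!r}")

    if kind == "state":
        return f"{US_BUCKET}/states/{code}.h5"
    if kind == "congressional_district":
        return f"{US_BUCKET}/districts/{code}.h5"
    if kind == "place":
        # Places reuse the parent state's dataset: NJ-57000 -> states/NJ.h5
        state, sep, fips = code.partition("-")
        if not sep or not state.isalpha() or not fips.isdigit():
            raise ValueError(f"Malformed place region: {region!r}")
        return f"{US_BUCKET}/states/{state}.h5"

    # Unknown regions raise rather than silently using the national file.
    raise ValueError(f"Unknown region: {region!r}")
```

Raising on unknown regions is the point of the second fix: a silent fallback would let the api metadata and the worker disagree about which dataset was used.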
6da4f35 to 0be9e1a
…gine-v4 # Conflicts: # pyproject.toml
Codecov Report❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3487      +/-   ##
==========================================
+ Coverage   77.00%   77.57%   +0.57%
==========================================
  Files          63       65       +2
  Lines        3418     3581     +163
  Branches      617      642      +25
==========================================
+ Hits         2632     2778     +146
- Misses        612      621       +9
- Partials      174      182       +8
Summary
- policyengine v4 line so the API can import the TRACE provenance helpers.
- SimulationOptions and dataset helpers.
- policyengine_api.libs.runtime_bundle.resolve_runtime_bundle, which resolves the model version, data version, canonical HF dataset URI, worker dataset URI, dataset sha, package versions, and bundle fingerprint for a simulation request.
- data_version when sending payloads to Modal instead of dropping it.
- policyengine_bundle to economy-service metadata, _runtime_bundle, options hashing, and trace-candidate telemetry.

Modal worker status
- 24959697217 passed staging deploy, staging live simulation tests, production deploy, and production integration tests on April 26, 2026.
- /versions now registers the exact worker this PR needs: policyengine-simulation-us1-653-3-uk2-88-0 for policyengine-us==1.653.3 and policyengine-uk==2.88.0.

Verification
- FLASK_DEBUG=1 uv run ruff check policyengine_api/libs/runtime_bundle.py policyengine_api/libs/simulation_types.py policyengine_api/libs/simulation_api_modal.py policyengine_api/services/economy_service.py tests/unit/libs/test_runtime_bundle.py tests/unit/libs/test_simulation_types.py tests/unit/libs/test_simulation_api_modal.py tests/unit/services/test_economy_service.py tests/fixtures/services/economy_service.py
- FLASK_DEBUG=1 uv run pytest tests/unit/libs/test_runtime_bundle.py tests/unit/libs/test_simulation_types.py tests/unit/libs/test_simulation_api_modal.py tests/unit/services/test_economy_service.py -q (130 passed, 2 warnings)
- 173 passed, 1 deselected, 1 warning.
- 24959697217: staging and production deploy/integration jobs succeeded.
- version=1.653.3, data_version=1.87.0, and _runtime_bundle: job fc-01KQ566MSKNJRJ2C5SCGNNGQZM completed with budget, poverty, and inequality results.

Follow-up
Fixes #3486. Unblocks #3485.
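
As a rough illustration of the bundle fingerprint and the data_version pass-through listed in the Summary, a minimal sketch might look like the following. The field names inside the bundle, the choice of SHA-256 over canonical JSON, and both helper names are assumptions made for this example; the PR text does not specify how resolve_runtime_bundle computes its fingerprint or how the Modal payload is assembled.

```python
import hashlib
import json


def bundle_fingerprint(bundle: dict) -> str:
    """Hash a bundle deterministically (hypothetical scheme: SHA-256 of
    key-sorted, whitespace-free JSON)."""
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_worker_payload(options: dict, bundle: dict) -> dict:
    """Attach the bundle to a worker payload, keeping data_version
    instead of dropping it (hypothetical helper)."""
    payload = dict(options)  # copy so the caller's options are untouched
    payload["data_version"] = bundle["data_version"]
    payload["_runtime_bundle"] = {
        **bundle,
        "fingerprint": bundle_fingerprint(bundle),
    }
    return payload


# Values taken from the Verification notes; key names are illustrative.
bundle = {
    "country": "us",
    "model_version": "1.653.3",
    "data_version": "1.87.0",
}
payload = build_worker_payload({"country": "us", "scope": "macro"}, bundle)
```

Sorting keys before hashing means two bundles with the same contents produce the same fingerprint regardless of how the dict was built, which is what makes the value usable for options hashing.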