Merged
43 changes: 42 additions & 1 deletion README.md
@@ -15,4 +15,45 @@ If `../policyengine-uk` exists, you can run:
```sh
make data-local
```


## Public UK calibrated transfer dataset

This repo now also includes a public calibrated microdata file:

- `policyengine_uk_data/storage/enhanced_cps_2025.h5`
- source manifest: `policyengine_uk_data/storage/enhanced_cps_source_2025.csv`

The public UK calibrated transfer dataset starts from a public export of eligible households from
PolicyEngine-US Enhanced CPS. In the current build, that source manifest contains
`28,532` households, not `1,000`. The pipeline maps those records into a
`UKSingleYearDataset`, aligns core UK-facing inputs such as council tax bands,
vehicle ownership, pensions, disability/PIP, consumption, and capital gains,
and then recalibrates the household weights against the UK national/region/country
target registry used by the loss pipeline.

The checked-in 2025 artifact uses a pinned `0.759` USD-to-GBP conversion rate
from the IRS 2025 yearly average exchange-rate table. The builder intentionally
does not call a live foreign-exchange API, because the committed H5 should be
reproducible from versioned code and source data. Pass `exchange_rate=...` to
`create_enhanced_cps` or `save_enhanced_cps` when rebuilding with a different
conversion assumption.
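
As a rough illustration of the pinned-rate approach, the sketch below converts USD amounts to GBP with a fixed default and an optional override. `PINNED_USD_TO_GBP` and `usd_to_gbp` are hypothetical names chosen for this example and are not the builder's internals; only the `0.759` rate and the `exchange_rate` keyword come from the text above.

```python
# Hypothetical sketch: illustrates a pinned conversion rate, not the real builder.
PINNED_USD_TO_GBP = 0.759  # pinned 2025 rate quoted in this README


def usd_to_gbp(amount_usd, exchange_rate=PINNED_USD_TO_GBP):
    """Convert a USD amount to GBP using a fixed rate instead of a live API."""
    if exchange_rate <= 0:
        raise ValueError("exchange_rate must be positive")
    return amount_usd * exchange_rate


# Default: the pinned rate, so repeated builds give the same values.
default_gbp = usd_to_gbp(50_000)

# Override: an explicit, different conversion assumption.
override_gbp = usd_to_gbp(50_000, exchange_rate=0.80)
```

Keeping the rate as an explicit argument with a pinned default means the checked-in artifact stays reproducible while alternative assumptions remain a one-argument change.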

The calibration step is tested against the native 2025 loss matrix and should
reduce the mean absolute relative error compared with the raw transfer weights.
Report full-artifact loss values from a named release artifact rather than
copying them into this README.
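
The metric mentioned above can be sketched on toy data. This is a minimal illustration, assuming a loss matrix with one row per household and one column per target; the function names, the numbers, and the hand-picked "calibrated" weights are all hypothetical and do not come from the real loss pipeline or its optimizer.

```python
# Hypothetical sketch: mean absolute relative error of weighted totals vs. targets.

def weighted_totals(weights, matrix):
    """Sum each target column of the matrix, scaling rows by household weight."""
    n_targets = len(matrix[0])
    return [
        sum(w * row[j] for w, row in zip(weights, matrix))
        for j in range(n_targets)
    ]


def mean_abs_relative_error(weights, matrix, targets):
    """Average of |estimate - target| / |target| across all targets."""
    estimates = weighted_totals(weights, matrix)
    errors = [abs(est - tgt) / abs(tgt) for est, tgt in zip(estimates, targets)]
    return sum(errors) / len(errors)


# Toy data: 3 households, 2 targets (household count, total income).
matrix = [
    [1.0, 20_000.0],
    [1.0, 40_000.0],
    [1.0, 90_000.0],
]
targets = [300.0, 14_000_000.0]

raw_weights = [100.0, 100.0, 50.0]         # stand-in for raw transfer weights
calibrated_weights = [110.0, 100.0, 90.0]  # stand-in for recalibrated weights

raw_error = mean_abs_relative_error(raw_weights, matrix, targets)
calibrated_error = mean_abs_relative_error(calibrated_weights, matrix, targets)
```

In this toy case the calibrated weights give a lower error than the raw weights, which is the property the test described above is meant to check.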

This is a public calibrated dataset, not a replacement for the FRS or enhanced
FRS. It is intended as the first step in a broader cross-country public-microdata
strategy.

Programmatic entrypoints:

- `policyengine_uk_data.datasets.create_enhanced_cps`
- `policyengine_uk_data.datasets.export_enhanced_cps_source`
- `policyengine_uk_data.datasets.save_enhanced_cps`

Backward-compatible aliases remain available:

- `policyengine_uk_data.datasets.create_policybench_transfer`
- `policyengine_uk_data.datasets.save_policybench_transfer`
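
One common way to keep old names working is to bind them to the new callables. The sketch below shows that pattern with stand-in functions; it is an assumption about the general technique, not a description of how `policybench_transfer` is actually implemented in this repo.

```python
# Hypothetical sketch: stand-in functions, not the real dataset builders.

def create_enhanced_cps():
    """Stand-in for the new constructor; returns a label instead of a dataset."""
    return "enhanced_cps_2025"


# The old name points at the same callable, so existing imports keep working.
create_policybench_transfer = create_enhanced_cps

new_result = create_enhanced_cps()
old_result = create_policybench_transfer()
```

With this pattern, callers of either name get identical behaviour, and the alias can be removed later without touching the new entrypoint.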
1 change: 1 addition & 0 deletions changelog.d/287.added.md
@@ -0,0 +1 @@
Added a public `enhanced_cps_2025` dataset constructor that maps a public export of eligible households from PolicyEngine-US Enhanced CPS into a `UKSingleYearDataset` and recalibrates household weights against the UK national/region/country target registry. Backward-compatible `policybench_transfer` aliases remain available.
1 change: 1 addition & 0 deletions changelog.d/321.fixed.md
@@ -0,0 +1 @@
Fixed the public enhanced CPS transfer builder to write valid PolicyEngine UK input leaves instead of prompt-only aggregate fields.
27 changes: 27 additions & 0 deletions policyengine_uk_data/datasets/__init__.py
@@ -0,0 +1,27 @@
from .enhanced_cps import (
    ENHANCED_CPS_FILE,
    ENHANCED_CPS_SOURCE_FILE,
    create_enhanced_cps,
    export_enhanced_cps_source,
    save_enhanced_cps,
)
from .frs import create_frs
from .policybench_transfer import (
    POLICYBENCH_TRANSFER_SOURCE_FILE,
    create_policybench_transfer,
    save_policybench_transfer,
)
from .spi import create_spi

__all__ = [
    "ENHANCED_CPS_FILE",
    "ENHANCED_CPS_SOURCE_FILE",
    "create_enhanced_cps",
    "export_enhanced_cps_source",
    "POLICYBENCH_TRANSFER_SOURCE_FILE",
    "create_frs",
    "create_policybench_transfer",
    "create_spi",
    "save_enhanced_cps",
    "save_policybench_transfer",
]