Handle unpopulated ni_class_3 target #378

@MaxGhenis

Problem

policyengine-uk-data parses an OBR target for Class 3 National Insurance contributions and maps it to the PolicyEngine UK variable ni_class_3, but the datasets do not appear to populate ni_class_3.

Current code paths found:

  • policyengine_uk_data/targets/sources/obr.py maps OBR “Class 3 NICs” / “Class 3 Voluntary NICs” rows to ni_class_3.
  • policyengine_uk_data/tests/test_obr_nics.py tests that mapping.
  • No dataset builder, imputation, calibration, or utility code path appears to assign ni_class_3 outside targets/tests.

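Because ni_class_3 is an input variable in policyengine-uk, simulations fall back to its default value whenever a dataset omits it, so every record carries the same value. Reweighting only changes record weights; if that default is zero, the weighted total stays at zero under any weights, so the OBR target for ni_class_3 cannot currently be matched by reweighting or calibration.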
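A minimal sketch of why reweighting cannot reach the target, assuming (this is not confirmed here) that the default value of ni_class_3 is zero. The household weights and the helper `weighted_total` are hypothetical and are not taken from policyengine-uk-data:

```python
# Assumption: the default for ni_class_3 is zero when a dataset omits it.
DEFAULT_NI_CLASS_3 = 0.0


def weighted_total(values, weights):
    """Return the survey-weighted sum of a record-level variable."""
    return sum(v * w for v, w in zip(values, weights))


# Hypothetical toy data: three households, none with ni_class_3 populated.
ni_class_3 = [DEFAULT_NI_CLASS_3] * 3
original_weights = [1_000.0, 2_500.0, 400.0]
reweighted = [5_000.0, 10.0, 90_000.0]  # any other set of weights

total_before = weighted_total(ni_class_3, original_weights)
total_after = weighted_total(ni_class_3, reweighted)

# Both totals are zero: the weights never enter a sum of zeros,
# so no choice of weights moves the aggregate toward a positive target.
print(total_before, total_after)
```

Whatever loss the calibration minimises, the ni_class_3 term in this toy setup is constant with respect to the weights, so it only adds a fixed error to the diagnostics.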

Why this matters

This makes the NI target set look broader than the data can support. It can also make calibration diagnostics misleading: Class 3 NICs are targeted, but no record-level variation exists unless a dataset explicitly stores ni_class_3.

Suggested resolution

One of:

  1. Populate ni_class_3 from a defensible data source or imputation, then keep the OBR target active.
  2. If no defensible source exists, exclude the ni_class_3 OBR target from calibration/loss until the dataset populates it.
  3. Add a guard test that target variables intended for calibration are either formula-computed from populated inputs or explicitly present/populated in the relevant dataset.
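Options 2 and 3 could share one check, sketched below under stated assumptions. The function `unsupported_targets`, the dict-of-lists dataset shape, and the variable name `formula_computed_tax` are hypothetical illustrations rather than existing policyengine-uk-data code. As a simplification, it treats any formula-computed variable as supported without verifying that its inputs are populated:

```python
def unsupported_targets(target_variables, dataset, formula_variables=frozenset()):
    """Return target variables that the dataset cannot support.

    A target counts as supported if it is formula-computed (listed in
    formula_variables) or stored in the dataset with at least one
    non-zero value.
    """
    unsupported = []
    for name in target_variables:
        if name in formula_variables:
            continue  # simplification: inputs to the formula are not checked
        values = dataset.get(name)
        if values is None or not any(v != 0 for v in values):
            unsupported.append(name)
    return unsupported


# Hypothetical toy inputs, not the real dataset schema.
toy_dataset = {"employment_income": [30_000.0, 0.0, 52_000.0]}
targets = ["formula_computed_tax", "ni_class_3"]
formulas = {"formula_computed_tax"}

missing = unsupported_targets(targets, toy_dataset, formulas)

# Option 3: a guard test would assert that `missing` is empty.
# Option 2: calibration could instead drop the unsupported targets.
active_targets = [t for t in targets if t not in missing]
print(missing, active_targets)
```

In this toy example the check flags ni_class_3, because it is neither formula-computed nor stored, and leaves the formula-computed variable active.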

Related: #88 tracks National Insurance targeting more broadly.
