Skip to content

chore(typecheck): clean datasets/eval/export/inference; enforce mypy on 12 packages#834

Open
xieofxie wants to merge 5 commits into
mainfrom
hualxie/typing_3
Open

chore(typecheck): clean datasets/eval/export/inference; enforce mypy on 12 packages#834
xieofxie wants to merge 5 commits into
mainfrom
hualxie/typing_3

Conversation

@xieofxie

@xieofxie xieofxie commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Extend the strict-mypy required-CI gate from 8 → 12 packages by cleaning datasets/, eval/, export/, and inference/ (189 errors → 0). Also restructures the dispatcher pattern in config/build.py to fix a regression timenick flagged on #825 (kwarg type-checking lost behind **dict[str, Any]).

What's in this PR

Dispatcher / overload cleanup (config/build.py, commands/config.py)

  • generate_hf_build_config: added a third @overload accepting module: str | None and returning the union of the two narrow returns. Lets the generate_build_config dispatcher collapse to a single call (no kwarg duplication, no Any widening, no cast).
  • commands.config command: ripped out the _shared_kwargs: dict[str, Any] = {...} workaround and replaced the if module: discriminator with isinstance(result, list) narrowing. Restores full call-site type checking against the callee's signature.

Cleaning 4 new folders (189 → 0)

Notable source-level fixes that cascaded:

  • core/time_utils.format_timestamp_iso — added @overloads so float → str and None → None propagate; cleared 3 downstream [arg-type] chains.
  • export/htp/base_writer.step decorator — replaced -> Any factory with Callable[[F], F], dropped no-op @wraps wrapper. One source fix cleared 18 cascading [untyped-decorator] errors across writers.
  • export/htp/exporter — typed the None-init core fields with proper Optional annotations + single non-None assert in _apply_hierarchy_tags; cleared the union-attr chain.
  • datasets/* — common pattern: class-level _image_col: str = "" default instead of runtime None-init in _detect_columns; eliminates Any | None property-return errors across image/segmentation/detection/depth datasets.
  • eval/*BaseImageProcessor | None guards across evaluators; pipeline() call-overload suppressions for runtime task strings that can't be statically matched against transformers' 60+ Literal overloads.

Configuration

  • pyproject.toml ignore_missing_imports: added soundfile, sklearn.*, evaluate(.*).
  • Dev deps: added scipy-stubs, types-psutil.
  • .github/workflows/lint.yml: switched dependency install to uv sync --all-extras --all-groups so type stubs in [dependency-groups] dev are present in CI. Extended the required mypy loop from 8 to 12 packages.

Tracked follow-ups

🤖 Generated with Claude Code

# Generate config(s). The ``module: str | None`` overload of
# generate_hf_build_config returns WinMLBuildConfig | list[...],
# which isinstance(result, list) narrows for the branches below.
result = generate_hf_build_config(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# overload, which returns WinMLBuildConfig | list[WinMLBuildConfig] — matching
# this dispatcher's implementation return type. The dispatcher's own
# narrowing overloads above still tighten the return type for its callers.
return generate_hf_build_config(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert as comment in #825

Comment thread src/winml/modelkit/config/build.py Dismissed
…on 12 packages

- Clear all mypy errors across the 4 folders (189 → 0). Notable
  source-level fixes that cleared cascades:
  - core/time_utils.format_timestamp_iso: added @overload variants so
    float→str and None→None propagate to all 4 export call sites.
  - export/htp/base_writer.step: replaced `-> Any` decorator factory
    with `Callable[[F], F]` and dropped the no-op @wraps wrapper —
    one source fix cleared 18 [untyped-decorator] errors across
    console_writer / markdown_report_writer / metadata_writer.
  - export/htp/exporter: typed the `None`-init core fields with
    proper Optional class annotations and a single non-None assert in
    `_apply_hierarchy_tags` cleared the union-attr chain.
  - export/htp/hierarchy.trace_model_execution: widened the
    example_inputs param to tuple|dict (callee already handled both
    via isinstance).
- Added missing third-party stubs / ignores in pyproject.toml:
  - dev dep: scipy-stubs, types-psutil.
  - ignore_missing_imports: soundfile, sklearn.*, evaluate(.*).
- Extended .github/workflows/lint.yml required mypy loop from 8 to
  12 packages. New errors in any of the listed folders now block
  PRs.
- Silenced two unguarded `winml.modelkit.inference.onnx_config.*`
  imports in export/htp/config_generator.py with
  `# type: ignore[import-not-found]` referencing #859, which tracks
  whether to restore or delete the dead integration.
Comment thread src/winml/modelkit/core/time_utils.py Dismissed
Comment thread src/winml/modelkit/core/time_utils.py Dismissed
Comment thread src/winml/modelkit/core/time_utils.py Dismissed
@xieofxie xieofxie changed the title chore: typechecking for .. chore(typecheck): clean datasets/eval/export/inference; enforce mypy on 12 packages Jun 11, 2026
Comment thread src/winml/modelkit/datasets/processor_utils.py
@xieofxie xieofxie marked this pull request as ready for review June 11, 2026 06:00
@xieofxie xieofxie requested a review from a team as a code owner June 11, 2026 06:00

@DingmaomaoBJTU DingmaomaoBJTU left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type-checking PR review

Overall: Clean, well-scoped PR. The approach of progressively moving packages into the required mypy gate is sound. Found one structural correctness issue, two warnings about suppressions masking real problems, and a few low-priority notes.

Bug - dead runtime guard in tensor_similarity_evaluator.py

mypy proves the isinstance(model, WinMLCompositeModel) branch is unreachable given the function's declared parameter type. The two # type: ignore[unreachable] suppressions silence the warning and make the guard invisible to callers. Either widen the parameter type to WinMLPreTrainedModel | WinMLCompositeModel so the check is live, or remove it and document that composite models are excluded at the call site.

Warning - cast admits subtype incompatibility (evaluate.py)

cast("str | dict[str, str | Path]", config.model_path) with a comment admitting it is "subtype-incompatible-but-runtime-compatible" suppresses a genuine type mismatch. The right fix is to widen WinMLEvaluationConfig.model_path's declared type to match what WinMLAutoModel.from_onnx actually accepts, rather than casting at each call site.

Warning - dead try-branches in config_generator.py

Neither inference.onnx_config.input_generator nor inference.onnx_config.patterns exists at runtime. Both try-blocks always fall through to their fallback paths. The # type: ignore[import-not-found] suppresses mypy but the code inside the try-blocks is dead. Consider removing the unreachable import bodies entirely (leaving only the fallback logic) until issue #859 resolves the module's fate.

Warning - duplicate if TYPE_CHECKING blocks (pipeline.py)

Two separate if TYPE_CHECKING: blocks appear at lines 26 and 30. They should be merged into one block. As-is it reads as if the second was added without noticing the first.

Info - class-level empty-string sentinel (datasets/image.py)

Replacing None with "" as the pre-detection sentinel is fine since all checks use if not self._image_col: consistently. Worth noting: a subclass that accidentally skips _detect_columns() will inherit the class-level "" and fail with KeyError: '' rather than a more debuggable AttributeError. Not a blocker.

Info - @step decorator TypeVar bound (base_writer.py)

F = TypeVar("F", bound="Callable[..., int]") is stricter than the old wrapper. Any @step-decorated handler that does not return int will now get a mypy error. Confirm all existing handlers annotate -> int; if any return None, they'll surface as new failures once dependent packages are gated.

@zhenchaoni

Copy link
Copy Markdown
Member

Heads-up on the dict(processor(...)) / dict(tokenizer(...)) wrappers added in datasets/{depth_estimation,image,object_detection,text,image_segmentation}.py:

These shallow-copy the returned BatchEncoding into a plain dict, which drops BatchEncoding-specific methods like .to(device) and the .encodings property. Should be fine here since the result is immediately consumed by dataset.map (which plain-dictifies anyway), but it's the one change in this PR with non-trivial runtime surface area. A single smoke run through the datasets path would confirm nothing downstream was relying on the BatchEncoding type.

@xieofxie

Copy link
Copy Markdown
Contributor Author

Type-checking PR review

Overall: Clean, well-scoped PR. The approach of progressively moving packages into the required mypy gate is sound. Found one structural correctness issue, two warnings about suppressions masking real problems, and a few low-priority notes.

Bug - dead runtime guard in tensor_similarity_evaluator.py

mypy proves the isinstance(model, WinMLCompositeModel) branch is unreachable given the function's declared parameter type. The two # type: ignore[unreachable] suppressions silence the warning and make the guard invisible to callers. Either widen the parameter type to WinMLPreTrainedModel | WinMLCompositeModel so the check is live, or remove it and document that composite models are excluded at the call site.

Warning - cast admits subtype incompatibility (evaluate.py)

cast("str | dict[str, str | Path]", config.model_path) with a comment admitting it is "subtype-incompatible-but-runtime-compatible" suppresses a genuine type mismatch. The right fix is to widen WinMLEvaluationConfig.model_path's declared type to match what WinMLAutoModel.from_onnx actually accepts, rather than casting at each call site.

Warning - dead try-branches in config_generator.py

Neither inference.onnx_config.input_generator nor inference.onnx_config.patterns exists at runtime. Both try-blocks always fall through to their fallback paths. The # type: ignore[import-not-found] suppresses mypy but the code inside the try-blocks is dead. Consider removing the unreachable import bodies entirely (leaving only the fallback logic) until issue #859 resolves the module's fate.

Warning - duplicate if TYPE_CHECKING blocks (pipeline.py)

Two separate if TYPE_CHECKING: blocks appear at lines 26 and 30. They should be merged into one block. As-is it reads as if the second was added without noticing the first.

Info - class-level empty-string sentinel (datasets/image.py)

Replacing None with "" as the pre-detection sentinel is fine since all checks use if not self._image_col: consistently. Worth noting: a subclass that accidentally skips _detect_columns() will inherit the class-level "" and fail with KeyError: '' rather than a more debuggable AttributeError. Not a blocker.

Info - @step decorator TypeVar bound (base_writer.py)

F = TypeVar("F", bound="Callable[..., int]") is stricter than the old wrapper. Any @step-decorated handler that does not return int will now get a mypy error. Confirm all existing handlers annotate -> int; if any return None, they'll surface as new failures once dependent packages are gated.

Fixed and one issue created for you #859

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants