Skip to content

fix: import scikit-learn lazily in openstef-beam r2 metric#981

Open
Valyrian-Code wants to merge 4 commits into
OpenSTEF:mainfrom
Valyrian-Code:fix/beam-lazy-sklearn-805
Open

fix: import scikit-learn lazily in openstef-beam r2 metric#981
Valyrian-Code wants to merge 4 commits into
OpenSTEF:mainfrom
Valyrian-Code:fix/beam-lazy-sklearn-805

Conversation

@Valyrian-Code

Copy link
Copy Markdown
Contributor

What

scikit-learn is an optional dependency of openstef-beam (it is not declared in the package's dependencies), but metrics_deterministic.py imported r2_score from sklearn.metrics at module top. As a result, importing openstef_beam.metrics (or anything that pulls it in) failed with ModuleNotFoundError when scikit-learn was not installed. Resolves #805.

Fix

Move the from sklearn.metrics import r2_score import inside the r2() function, its only use site. Importing the module no longer requires scikit-learn; only calling r2() does.

Tests

  • test_r2_score_is_imported_lazily: asserts r2_score is not bound at module level (locks the lazy import).
  • test_r2_still_computes_with_sklearn_available: confirms r2() still computes correctly.

ty check, ruff check, ruff format --check, the module doctests, and the full beam metrics suite all pass.

scikit-learn is an optional dependency of openstef-beam (not declared in its
dependencies), but metrics_deterministic imported r2_score at module top, so
importing openstef_beam.metrics failed when scikit-learn was not installed.

Move the import inside r2() (its only use site) so the metrics module imports
without scikit-learn; only calling r2() needs it. Add regression tests.

Signed-off-by: RAJVEER42 <irajveer.bishnoi2310@gmail.com>
@Valyrian-Code Valyrian-Code requested a review from a team June 19, 2026 17:55

@egordm egordm left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this one. R2 is used as default in many openstef components. Making the import lazy would notify users only at runtime that they are missing a dependency.

I don't think making it lazy is the right approach.

It is really tempting to inline R2 though, which would remove openstef-beam's dependency on sklearn.

Per @egordm's review on OpenSTEF#981: instead of lazily importing sklearn's r2_score,
compute R2 directly with numpy so openstef-beam no longer depends on
scikit-learn at all. Matches scikit-learn's r2_score including sample weights
and the constant-y_true convention (1.0 for a perfect fit, else 0.0).

Signed-off-by: RAJVEER42 <irajveer.bishnoi2310@gmail.com>
@Valyrian-Code

Copy link
Copy Markdown
Contributor Author

Good call, thanks! I've inlined the R2 computation in numpy, so openstef-beam no longer depends on scikit-learn at all rather than just deferring the import to runtime. It matches scikit-learn's r2_score including sample weights and the constant-y_true convention (1.0 for a perfect fit, else 0.0), and I added value tests to lock that down. Also merged latest main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[OpenSTEF 4.0] Fix optional dependencies stef-beam

2 participants