fix(scoring): document confidence scaling, sort by severity within rule bucket by mimran-khan · Pull Request #153 · NVIDIA/SkillSpector

mimran-khan · 2026-06-22T20:03:26Z

Summary

Follow-up to #139 addressing non-blocking review feedback from @rng1995:

Confidence scaling documented — _compute_risk_score docstring now explicitly states that each finding's contribution is scaled by its confidence value (clamped to [0,1]) and that zero-confidence findings are skipped without consuming a weight slot.
Severity sort within rule bucket — Findings are now sorted by (rule_id, severity DESC) before applying diminishing weights, ensuring the highest-severity occurrence of each rule always receives the full weight regardless of input order.

Test plan

All 38 existing test_report.py tests pass (scoring, bands, edge cases, real-world scenarios)
test_same_rule_mixed_severities validates the sort ensures CRITICAL gets full weight

Closes review feedback on #139.

keshprad

Requesting changes for regression coverage: the implementation behavior looks correct from local validation, but this PR should add or adjust a test for the order-dependent case it fixes.

The existing test_same_rule_mixed_severities has CRITICAL before LOW, so it would pass on the base branch too. Please add a [LOW, CRITICAL] case for the same rule_id asserting score 52, which directly covers the new severity sort behavior.

rng1995

Re-review

The implementation is correct — sorting findings by (rule_id, severity_rank) before applying diminishing weights ensures the highest-severity occurrence of each rule gets full weight, and the docstring updates are clear.

But @keshprad's requested regression test is still missing — no new commit since the review (head is still 6c9287e1, and the diff only touches report.py). As noted there, the existing test_same_rule_mixed_severities orders CRITICAL before LOW, so it would also pass on the base branch and doesn't actually exercise the new sort. Keeping this at request changes pending:

A same-rule_id case in [LOW, CRITICAL] input order asserting the sorted score (~52). Without the sort that input scores lower (LOW full + CRITICAL half-weighted), so the case directly locks in the fix.

rng1995

Re-review — implementation is correct, but the requested regression test is still missing

Head is unchanged since @keshprad's review (6c9287e1); the diff is still report.py only (+14/-1) with no test changes. Re-requesting changes for the same reason. Credit to @keshprad for the catch.

The implementation itself is correct: sorting by (rule_id, severity_rank) before applying the diminishing per-rule weights ensures the highest-severity occurrence in each bucket always receives the full weight, and the confidence-scaling docstring is a useful clarification.

The gap is regression coverage of the behavior this PR fixes. The existing test_same_rule_mixed_severities lists CRITICAL before LOW, so it already passes on the base branch (no sort) and does not exercise the new ordering. Please add a same-rule_id [LOW, CRITICAL] case (input order LOW first) asserting the sorted score of 52:

With the sort: CRITICAL receives the full weight and LOW the diminished weight -> 52.
Without the sort (base branch): LOW would take full weight and CRITICAL the half weight -> a lower score.

That input-order-sensitive case is exactly what locks in the fix and guards against a future regression. Once it's added I'm happy to approve.

mimran-khan · 2026-06-23T08:58:43Z

Thanks for review, will work on this and add it.

…le bucket Addresses non-blocking review feedback: 1. Docstring now explicitly documents that each finding's contribution is scaled by its confidence value, and that zero-confidence findings are skipped without consuming a weight slot. 2. Findings are now sorted by (rule_id, severity DESC) before applying diminishing weights, ensuring the highest-severity occurrence of each rule always receives the full weight regardless of input order.

@keshprad

… sort Add test_same_rule_low_before_critical_sorted_correctly that feeds [LOW, CRITICAL] for the same rule_id. Without the severity sort the score is 30 (LOW gets full weight); with it the score is 52 (CRITICAL gets full weight). This locks in the fix and guards against regression. Addresses review feedback from @keshprad and @rng1995. Signed-off-by: mimran-khan <mohammed_imran.khan@outlook.com>

mimran-khan · 2026-06-23T09:42:28Z

Addressed — pushed the regression test:

test_same_rule_low_before_critical_sorted_correctly: feeds [LOW, CRITICAL] for the same rule_id and asserts score 52. Without the severity sort the score would be 30 (LOW gets full weight at 51.0=5, CRITICAL gets diminished at 500.5=25). With the sort it's 52 (CRITICAL first at 501.0=50, LOW second at 50.5=2.5).

Branch rebased on latest main (5 PRs merged upstream today).

rng1995

[Automated SkillSpector Review]

Re-review: blocker resolved — approving.

The requested input-order regression test is now present: test_same_rule_low_before_critical_sorted_correctly feeds [LOW, CRITICAL] for the same rule_id and asserts score == 52. That directly locks in the (rule_id, severity_rank) sort — without it the same input scores 30 (LOW full-weight, CRITICAL half), so the test would fail on the pre-sort behavior. Implementation and docstring were already correct. Approving.

keshprad

Re-review: requested regression coverage is now present and the change looks good.

The new test_same_rule_low_before_critical_sorted_correctly feeds [LOW, CRITICAL] for the same rule_id and asserts score 52, directly covering the input-order case from the previous review. I verified the report tests and the full default test suite against the PR source with PYTHONPATH=src; both pass. Approving.

keshprad requested changes Jun 22, 2026

View reviewed changes

rng1995 requested changes Jun 23, 2026

View reviewed changes

mimran-khan added 2 commits June 23, 2026 15:03

mimran-khan force-pushed the fix/scoring-confidence-docs-and-sort branch from 6c9287e to 4d50828 Compare June 23, 2026 09:35

rng1995 mentioned this pull request Jun 23, 2026

Merge updates from upstream and integrate local security report optimizations #163

Open

rng1995 approved these changes Jun 24, 2026

View reviewed changes

mimran-khan requested a review from keshprad June 24, 2026 08:37

keshprad approved these changes Jun 24, 2026

View reviewed changes

keshprad merged commit 7bc9c0f into NVIDIA:main Jun 24, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(scoring): document confidence scaling, sort by severity within rule bucket#153

fix(scoring): document confidence scaling, sort by severity within rule bucket#153
keshprad merged 2 commits into
NVIDIA:mainfrom
mimran-khan:fix/scoring-confidence-docs-and-sort

mimran-khan commented Jun 22, 2026

Uh oh!

keshprad left a comment

Uh oh!

rng1995 left a comment

Uh oh!

rng1995 left a comment

Uh oh!

mimran-khan commented Jun 23, 2026

Uh oh!

mimran-khan commented Jun 23, 2026

Uh oh!

rng1995 left a comment

Uh oh!

keshprad left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

mimran-khan commented Jun 22, 2026

Summary

Test plan

Uh oh!

keshprad left a comment

Choose a reason for hiding this comment

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Re-review

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Re-review — implementation is correct, but the requested regression test is still missing

Uh oh!

mimran-khan commented Jun 23, 2026

Uh oh!

mimran-khan commented Jun 23, 2026

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

keshprad left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants