Skip to content

fix(meta-analyzer): add heuristic fallback filter for --no-llm mode#143

Open
mimran-khan wants to merge 2 commits into
NVIDIA:mainfrom
mimran-khan:fix/no-llm-heuristic-fallback-138
Open

fix(meta-analyzer): add heuristic fallback filter for --no-llm mode#143
mimran-khan wants to merge 2 commits into
NVIDIA:mainfrom
mimran-khan:fix/no-llm-heuristic-fallback-138

Conversation

@mimran-khan

Copy link
Copy Markdown
Contributor

Summary

The --no-llm mode (default when no LLM provider is configured) previously bypassed the meta-analyzer entirely, presenting raw unfiltered regex matches as final results. This caused the same skill to score 100/100 without LLM but 35/100 with LLM analysis — the LLM-based filter correctly removed 70%+ of false positives that the static pass-through could not.

This is the typical experience for first-time users and CI/CD pipelines that evaluate SkillSpector without an LLM configured.

Changes

  • Heuristic fallback filter (_fallback_filtered): replaces the previous pass-through with two rule-based checks:
    1. Confidence threshold (0.4): drops findings below this confidence level
    2. Code example detection: drops findings whose context matches is_code_example() indicators (fenced code blocks, "example:", "e.g.", etc.)
  • Report metadata transparency: adds meta_analysis_applied (boolean) and filtering_mode: "heuristic" to JSON/report output, so consumers know the filtering level applied

Behavioral Impact

Scenario Before After
--no-llm with docs code blocks All raw findings reported Code example findings filtered
--no-llm with low-confidence matches All reported (confidence 0.1-0.3) Dropped below 0.4 threshold
--no-llm with genuine findings (confidence >= 0.4) Reported Still reported
JSON output metadata llm_requested: false + meta_analysis_applied: false, filtering_mode: "heuristic"

Test Plan

  • 11 unit tests covering confidence threshold (boundary, below, above), code example filtering (fenced blocks, keywords, normal context, no context), combined filtering, and remediation application
  • Full test suite: 687 passed, 0 failures, 0 regressions
  • ruff check and ruff format clean

Fixes #138

@mimran-khan

Copy link
Copy Markdown
Contributor Author

PS: These issues were found while running bunch of skill evals using SkillSpector and hence raising all the issues so that i can use these with correct reference to Upstream code

@rng1995 rng1995 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: heuristic fallback filter for --no-llm mode

The meta_analysis_applied / filtering_mode: "heuristic" metadata is a good transparency addition. However I'm requesting changes: this turns the --no-llm fallback from a pass-through into a filter that can silently drop genuine high-severity detections, which is a fail-open regression for a security tool.

Blocking issues

1. Confidence drop has no severity floor.

if f.confidence < _NO_LLM_CONFIDENCE_THRESHOLD:  # 0.4
    continue

A CRITICAL/HIGH finding emitted at, say, 0.35 confidence is dropped entirely — gone from both score and report. High-severity detections shouldn't be discarded on confidence alone.

2. is_code_example drop with no adjudication.
This reuses the same broad, attacker-controllable substring heuristic discussed in #140 (example:, e.g., such as, documentation, fences, # note: …). In --no-llm mode there is no LLM to second-guess it, so an author can suppress a real finding by salting nearby context.

3. Applies on LLM failure too (fail-open).
_fallback_filtered is invoked not only for --no-llm but also from the except path when an LLM call fails. So a transient LLM error now silently drops findings that previously passed through — exactly when analysis is already degraded, the tool becomes less conservative.

Recommendations

  • Never drop HIGH/CRITICAL findings on confidence; gate any confidence threshold by severity (or downweight rather than drop).
  • Prefer reducing confidence over removing findings, so they remain in the report and still contribute (reduced) score.
  • On the LLM-failure fallback path, prefer passing findings through (fail-closed for a security tool) rather than applying the heuristic drop.

Tests

Please add: (a) a HIGH/CRITICAL finding below the threshold is retained once a severity gate exists, and (b) coverage for the LLM-failure fallback path.

Inter-PR note

This compounds with #140 (doc-path 0.3x confidence) and #139 (confidence-weighted scoring): a documentation finding reduced to ~0.27 by #140 would then be dropped here by the 0.4 threshold. Across #139/#140/#142/#143 the net effect is a notable drop in detection sensitivity, most pronounced in --no-llm mode — please validate end-to-end that genuine high-severity findings still surface.

@mimran-khan

Copy link
Copy Markdown
Contributor Author

Thanks for the detailed review. All three blocking issues have been addressed:

1. Severity floor added:
CRITICAL and HIGH findings are never dropped on confidence alone, regardless of threshold. Only LOW and MEDIUM findings are subject to the 0.4 confidence gate.

2. Code-example context: downweight instead of drop:
In --no-llm mode there's no LLM safety net, so is_code_example now halves confidence (0.5x) instead of hard-dropping. A HIGH finding in code-example context at 0.6 confidence becomes 0.3 — still below threshold, but the severity floor keeps it in the report.

3. LLM failure path: fail-closed:
New _passthrough_with_defaults() function passes ALL findings through unchanged (adding default remediations only) when the LLM call fails. The heuristic filter is now only used for the explicit --no-llm path, not the error fallback.

Tests added:

  • TestSeverityFloor — CRITICAL/HIGH findings at low confidence are retained
  • TestCodeExampleFiltering — findings are downweighted not dropped; severity floor still protects
  • TestLLMFailurePassthrough — verifies passthrough preserves all findings and that meta_analyzer uses passthrough on exception

Cross-PR compound effect:
With these changes, the worst case for a genuine HIGH finding in documentation context is: 0.9 * 0.3 (doc-path from #140) * 0.5 (code-example) = 0.135 confidence — below threshold, but retained by the severity floor. End-to-end: genuine high-severity findings always surface.

The --no-llm mode (default when no LLM provider is configured) previously
bypassed the meta-analyzer entirely, presenting raw unfiltered regex matches
as final results. This caused the same skill to score 100/100 without LLM
but 35/100 with it, as the LLM-based filter correctly removed 70%+ of false
positives that the static pass-through could not.

This commit replaces the pass-through _fallback_filtered with a heuristic
filter that applies two rule-based checks in --no-llm mode:
1. Confidence threshold: drop findings below 0.4 confidence
2. Code example detection: drop findings whose context matches
   is_code_example() indicators (fenced blocks, "example:", etc.)

Additionally, report metadata now includes:
- meta_analysis_applied: boolean indicating if LLM filtering was used
- filtering_mode: "heuristic" when LLM was not applied

This gives --no-llm users significantly cleaner results without requiring
LLM access, and makes the output transparent about its filtering level.

Fixes NVIDIA#138

Signed-off-by: Mohammed Imran Khan <mohammed_imran.khan@outlook.com>
…ail-closed on LLM error

Three fixes to the heuristic fallback filter:

1. Severity floor: CRITICAL and HIGH findings are never dropped on
   confidence alone, regardless of threshold. Only LOW and MEDIUM
   findings are subject to the 0.4 confidence gate.

2. Code-example context: downweight confidence by 0.5x instead of
   hard-dropping. In --no-llm mode there is no LLM safety net, so
   dropping findings based on spoofable context indicators is a
   detection-bypass risk.

3. LLM failure path (fail-closed): when LLM call raises an exception,
   all findings now pass through with default remediations instead of
   being filtered. A security tool should show more findings (not fewer)
   when analysis is degraded.
@mimran-khan mimran-khan force-pushed the fix/no-llm-heuristic-fallback-138 branch from 69cb045 to 805009f Compare June 22, 2026 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] --no-llm mode bypasses meta-analyzer entirely, presenting unfiltered raw findings as final results

2 participants