fix(meta-analyzer): add heuristic fallback filter for --no-llm mode#143
fix(meta-analyzer): add heuristic fallback filter for --no-llm mode#143mimran-khan wants to merge 2 commits into
Conversation
|
PS: These issues were found while running bunch of skill evals using SkillSpector and hence raising all the issues so that i can use these with correct reference to Upstream code |
rng1995
left a comment
There was a problem hiding this comment.
Review: heuristic fallback filter for --no-llm mode
The meta_analysis_applied / filtering_mode: "heuristic" metadata is a good transparency addition. However I'm requesting changes: this turns the --no-llm fallback from a pass-through into a filter that can silently drop genuine high-severity detections, which is a fail-open regression for a security tool.
Blocking issues
1. Confidence drop has no severity floor.
if f.confidence < _NO_LLM_CONFIDENCE_THRESHOLD: # 0.4
continueA CRITICAL/HIGH finding emitted at, say, 0.35 confidence is dropped entirely — gone from both score and report. High-severity detections shouldn't be discarded on confidence alone.
2. is_code_example drop with no adjudication.
This reuses the same broad, attacker-controllable substring heuristic discussed in #140 (example:, e.g., such as, documentation, fences, # note: …). In --no-llm mode there is no LLM to second-guess it, so an author can suppress a real finding by salting nearby context.
3. Applies on LLM failure too (fail-open).
_fallback_filtered is invoked not only for --no-llm but also from the except path when an LLM call fails. So a transient LLM error now silently drops findings that previously passed through — exactly when analysis is already degraded, the tool becomes less conservative.
Recommendations
- Never drop
HIGH/CRITICALfindings on confidence; gate any confidence threshold by severity (or downweight rather than drop). - Prefer reducing confidence over removing findings, so they remain in the report and still contribute (reduced) score.
- On the LLM-failure fallback path, prefer passing findings through (fail-closed for a security tool) rather than applying the heuristic drop.
Tests
Please add: (a) a HIGH/CRITICAL finding below the threshold is retained once a severity gate exists, and (b) coverage for the LLM-failure fallback path.
Inter-PR note
This compounds with #140 (doc-path 0.3x confidence) and #139 (confidence-weighted scoring): a documentation finding reduced to ~0.27 by #140 would then be dropped here by the 0.4 threshold. Across #139/#140/#142/#143 the net effect is a notable drop in detection sensitivity, most pronounced in --no-llm mode — please validate end-to-end that genuine high-severity findings still surface.
|
Thanks for the detailed review. All three blocking issues have been addressed: 1. Severity floor added: 2. Code-example context: downweight instead of drop: 3. LLM failure path: fail-closed: Tests added:
Cross-PR compound effect: |
The --no-llm mode (default when no LLM provider is configured) previously bypassed the meta-analyzer entirely, presenting raw unfiltered regex matches as final results. This caused the same skill to score 100/100 without LLM but 35/100 with it, as the LLM-based filter correctly removed 70%+ of false positives that the static pass-through could not. This commit replaces the pass-through _fallback_filtered with a heuristic filter that applies two rule-based checks in --no-llm mode: 1. Confidence threshold: drop findings below 0.4 confidence 2. Code example detection: drop findings whose context matches is_code_example() indicators (fenced blocks, "example:", etc.) Additionally, report metadata now includes: - meta_analysis_applied: boolean indicating if LLM filtering was used - filtering_mode: "heuristic" when LLM was not applied This gives --no-llm users significantly cleaner results without requiring LLM access, and makes the output transparent about its filtering level. Fixes NVIDIA#138 Signed-off-by: Mohammed Imran Khan <mohammed_imran.khan@outlook.com>
…ail-closed on LLM error Three fixes to the heuristic fallback filter: 1. Severity floor: CRITICAL and HIGH findings are never dropped on confidence alone, regardless of threshold. Only LOW and MEDIUM findings are subject to the 0.4 confidence gate. 2. Code-example context: downweight confidence by 0.5x instead of hard-dropping. In --no-llm mode there is no LLM safety net, so dropping findings based on spoofable context indicators is a detection-bypass risk. 3. LLM failure path (fail-closed): when LLM call raises an exception, all findings now pass through with default remediations instead of being filtered. A security tool should show more findings (not fewer) when analysis is degraded.
69cb045 to
805009f
Compare
Summary
The
--no-llmmode (default when no LLM provider is configured) previously bypassed the meta-analyzer entirely, presenting raw unfiltered regex matches as final results. This caused the same skill to score 100/100 without LLM but 35/100 with LLM analysis — the LLM-based filter correctly removed 70%+ of false positives that the static pass-through could not.This is the typical experience for first-time users and CI/CD pipelines that evaluate SkillSpector without an LLM configured.
Changes
_fallback_filtered): replaces the previous pass-through with two rule-based checks:is_code_example()indicators (fenced code blocks, "example:", "e.g.", etc.)meta_analysis_applied(boolean) andfiltering_mode: "heuristic"to JSON/report output, so consumers know the filtering level appliedBehavioral Impact
--no-llmwith docs code blocks--no-llmwith low-confidence matches--no-llmwith genuine findings (confidence >= 0.4)llm_requested: falsemeta_analysis_applied: false,filtering_mode: "heuristic"Test Plan
ruff checkandruff formatcleanFixes #138