feat(report): add analysis_completeness field to JSON output #160

Open
mimran-khan wants to merge 1 commit into NVIDIA:main from mimran-khan:feat/analysis-completeness-field

@mimran-khan

Contributor

Summary

When SkillSpector produces a "clean" scan (no findings), consumers cannot tell whether the tool analyzed every component or silently skipped some because of missing file content, LLM unavailability, or other limitations. As a result, a clean scan cannot be trusted for registry gating decisions.

This PR adds an analysis_completeness section to the JSON report format that explicitly communicates scan coverage.

Example Output

{
  "analysis_completeness": {
    "total_components": 5,
    "scanned_components": 5,
    "coverage_percent": 100.0,
    "llm_analysis": "applied",
    "findings_before_filtering": 3,
    "findings_after_filtering": 1,
    "limitations": null,
    "is_complete": true
  }
}

When limitations exist:

{
  "analysis_completeness": {
    "total_components": 5,
    "scanned_components": 3,
    "coverage_percent": 60.0,
    "llm_analysis": "skipped",
    "findings_before_filtering": 2,
    "findings_after_filtering": 2,
    "limitations": [
      "2 component(s) had no content in file_cache (skipped)",
      "LLM meta-analysis unavailable: OPENAI_API_KEY not set"
    ],
    "is_complete": false
  }
}
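As a rough illustration of how the fields in the two examples relate to each other, the sketch below derives them from raw counts. It is not the code in this PR: the function name `build_analysis_completeness` and its parameters are hypothetical, and the rules that `is_complete` is true exactly when no limitations were recorded, and that zero components count as 100% coverage, are assumptions inferred from the examples.

```python
from typing import Optional


def build_analysis_completeness(
    total_components: int,
    scanned_components: int,
    llm_skip_reason: Optional[str],
    findings_before_filtering: int,
    findings_after_filtering: int,
) -> dict:
    """Hypothetical helper: assemble the analysis_completeness section."""
    limitations = []

    # Components without cached content are reported as skipped.
    skipped = total_components - scanned_components
    if skipped > 0:
        limitations.append(
            f"{skipped} component(s) had no content in file_cache (skipped)"
        )

    # A non-None reason means the LLM pass did not run.
    if llm_skip_reason is not None:
        limitations.append(f"LLM meta-analysis unavailable: {llm_skip_reason}")

    if total_components == 0:
        # Assumption: nothing to scan is treated as full coverage.
        coverage_percent = 100.0
    else:
        coverage_percent = round(100.0 * scanned_components / total_components, 1)

    return {
        "total_components": total_components,
        "scanned_components": scanned_components,
        "coverage_percent": coverage_percent,
        "llm_analysis": "applied" if llm_skip_reason is None else "skipped",
        "findings_before_filtering": findings_before_filtering,
        "findings_after_filtering": findings_after_filtering,
        # Serialized as JSON null when empty, matching the first example.
        "limitations": limitations or None,
        # Assumption: complete means no limitation was recorded.
        "is_complete": not limitations,
    }
```

Calling it with `(5, 3, "OPENAI_API_KEY not set", 2, 2)` yields the same values as the second example above.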

Design Decisions

  • JSON only: SARIF has its own coverage mechanisms; terminal is human-readable. Only JSON format gets this field.
  • is_complete boolean: Enables simple programmatic checks (if not report.analysis_completeness.is_complete: warn)
  • Human-readable limitations: Each string explains what was missed and why — actionable for operators
  • Non-breaking: Field is additive; existing consumers that don't check it are unaffected
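To make the `is_complete` check concrete, here is a minimal consumer-side sketch of how a CI step or registry gate might read the field. The function name `check_completeness` is hypothetical, and treating a report without the field as "not confirmed complete" is an assumption about gate policy, not behavior defined by this PR.

```python
from __future__ import annotations

import json


def check_completeness(report_text: str) -> tuple[bool, list[str]]:
    """Return (is_complete, reasons) for a JSON report string."""
    report = json.loads(report_text)
    completeness = report.get("analysis_completeness")

    if completeness is None:
        # Assumption: reports produced before this field existed
        # cannot be confirmed as complete.
        return False, ["report has no analysis_completeness field"]

    if completeness.get("is_complete") is True:
        return True, []

    # limitations may be null or absent, so fall back to a generic reason.
    reasons = completeness.get("limitations") or [
        "is_complete is false but no limitations were listed"
    ]
    return False, list(reasons)
```

A gate could then warn or fail when the first element is `False` and print the returned reasons for the operator. Consumers that ignore the field keep working unchanged, which is the non-breaking property noted above.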

Testing

8 new tests covering:

  • Full coverage produces is_complete: true
  • Partial file_cache coverage reports skipped components
  • LLM unavailable noted in limitations
  • LLM disabled (--no-llm) noted
  • Filtered findings count tracked
  • Empty components edge case
  • JSON format includes field
  • SARIF format does NOT include field

Fixes #149

Adds an analysis_completeness section to the JSON report output that
communicates scan coverage and known limitations to consumers:

- total_components / scanned_components / coverage_percent
- llm_analysis status (applied/skipped)
- findings_before_filtering / findings_after_filtering
- limitations array (human-readable list of gaps)
- is_complete boolean for quick programmatic checks

This helps CI integrations and registry gates understand whether a
"clean" scan actually analyzed everything or if gaps exist that
require re-scanning with full capabilities.

Only included in JSON format output; SARIF and terminal formats are
unchanged.

Fixes NVIDIA#149

Development

Successfully merging this pull request may close these issues.

[Bug] LLM API failures silently produce zero findings with no retry or user notification
