[Bug] SARIF output non-compliant — missing required rules[] array in tool.driver #148

Description

@mimran-khan

Summary

Imagine a hospital that runs blood tests and emails you a lab report in a format that no other medical software can read — your doctor's system can't import it, your health app can't display it, and the insurance portal rejects it as corrupted. The test results are fine, but the container they're delivered in is broken.

SkillSpector's SARIF output claims to follow the SARIF v2.1.0 standard, but it omits the required rules[] array from tool.driver. SARIF consumers (GitHub Code Scanning, VS Code SARIF Viewer, Azure DevOps, DefectDojo) resolve every result.ruleId against an entry in tool.driver.rules[] to obtain metadata such as name, description, severity, and help text. Without that array, lenient consumers display findings with no rule metadata at all, and strict consumers reject the file.

Why This Matters — Real-World Scenario

Scenario: GitHub Advanced Security integration

An organization configures their CI pipeline to run SkillSpector and upload results to GitHub Code Scanning via the SARIF upload endpoint (POST /repos/{owner}/{repo}/code-scanning/sarifs).

The upload succeeds (GitHub accepts syntactically valid JSON), but when developers open the Security tab, they see:

  • No rule descriptions — just raw rule IDs like "SQP-2" or "PE3"
  • No severity levels — GitHub cannot determine if findings are critical or informational
  • No help links — developers cannot learn what each finding means or how to remediate

The security team wanted GitHub Code Scanning to show rich, actionable findings. Instead, they get an opaque list of IDs. They must manually cross-reference every finding with SkillSpector documentation. With 50+ findings per scan, this defeats the purpose of automated security scanning integration.

Some stricter SARIF consumers (e.g., certain SIEM imports, DefectDojo strict mode) reject the file entirely because the schema validation fails on missing rules[].

Reproduction

skillspector scan ./any-skill/ --format sarif -o output.sarif

python -c "
import json
data = json.load(open('output.sarif'))
run = data['runs'][0]
driver = run['tool']['driver']
print(f'Driver name: {driver.get(\"name\")}')
print(f'Driver version: {driver.get(\"version\")}')
print(f'Rules array: {driver.get(\"rules\", \"MISSING\")}')
print(f'Results count: {len(run.get(\"results\", []))}')

# Expected output:
# Driver name: SkillSpector
# Driver version: 2.2.3
# Rules array: MISSING
# Results count: 15 (results reference ruleIds that have no definition)
"

# Validate against SARIF schema
pip install sarif-tools
sarif validate output.sarif
# Error: tool.driver.rules is required when results reference ruleId
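
The same check can be scripted so it fails a CI job. The following is a minimal sketch, not part of SkillSpector: the function name find_orphaned_rule_ids and the sample document are illustrative assumptions, and it only inspects tool.driver.rules (rules defined in tool.extensions are ignored).

```python
import json


def find_orphaned_rule_ids(sarif: dict) -> dict[int, list[str]]:
    """Map each run index to the sorted ruleIds missing from tool.driver.rules."""
    orphans: dict[int, list[str]] = {}
    for index, run in enumerate(sarif.get("runs", [])):
        driver = run.get("tool", {}).get("driver", {})
        # IDs that have a rule definition in this run's driver.
        defined = {rule.get("id") for rule in driver.get("rules", [])}
        # IDs that results in this run point at.
        referenced = {
            result["ruleId"]
            for result in run.get("results", [])
            if "ruleId" in result
        }
        missing = sorted(referenced - defined)
        if missing:
            orphans[index] = missing
    return orphans


# Hand-written sample that mirrors the reported shape (no rules array).
sample = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "SkillSpector", "version": "2.2.3"}},
            "results": [
                {"ruleId": "PE3", "message": {"text": "example"}},
                {"ruleId": "SQP-2", "message": {"text": "example"}},
            ],
        }
    ],
}

print(json.dumps(find_orphaned_rule_ids(sample)))
```

To check a real file, replace sample with json.load() on output.sarif; an empty result means every ruleId has a definition.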

Root Cause

In src/skillspector/sarif_models.py, the SarifDriver model (lines 79-83):

class SarifDriver(BaseModel):
    name: str
    version: str
    # Missing: rules: list[SarifReportingDescriptor] = []

The model only defines name and version. There is no rules field, so rule metadata (descriptions, severity, help URIs) is never serialized.
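
One possible shape for the missing field is sketched below. This is an assumption, not the project's code: it uses standard-library dataclasses as a stand-in for the pydantic models so that it runs without dependencies, and SarifMessage, SarifConfiguration, and the placeholder rule name are hypothetical. The property names (id, name, shortDescription, defaultConfiguration.level) follow the SARIF v2.1.0 reportingDescriptor object.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SarifMessage:
    text: str


@dataclass
class SarifConfiguration:
    # SARIF levels are "none", "note", "warning", or "error".
    level: str = "warning"


@dataclass
class SarifReportingDescriptor:
    id: str
    name: str
    shortDescription: SarifMessage
    defaultConfiguration: SarifConfiguration = field(
        default_factory=SarifConfiguration
    )


@dataclass
class SarifDriver:
    name: str
    version: str
    # default_factory gives each driver its own list instead of a shared one.
    rules: list[SarifReportingDescriptor] = field(default_factory=list)


driver = SarifDriver(
    name="SkillSpector",
    version="2.2.3",
    rules=[
        SarifReportingDescriptor(
            id="PE3",
            name="PlaceholderRuleName",  # hypothetical; real names live in the project
            shortDescription=SarifMessage(text="Placeholder description."),
            defaultConfiguration=SarifConfiguration(level="error"),
        )
    ],
)

# asdict() recurses into nested dataclasses, producing a JSON-ready dict.
serialized = asdict(driver)
print(serialized["rules"][0])
```

In the pydantic version the equivalent change would be a rules field with a list default on SarifDriver, as the comment in the excerpt above suggests.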

In src/skillspector/nodes/report.py, the _build_sarif() function (lines 127-137):

def _build_sarif(findings: list[dict]) -> dict:
    tool = SarifTool(driver=SarifDriver(
        name="SkillSpector",
        version=__version__
    ))
    # Results are built with ruleId references...
    results = [SarifResult(ruleId=f["rule_id"], ...) for f in findings]
    # ...but no corresponding rule definitions exist in driver.rules

Each SarifResult has a ruleId field pointing to identifiers like "PE3", "MP2", "SQP-2", but since tool.driver has no rules[] array, these IDs are orphaned references.
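
A fix would need to emit one rule definition per distinct rule ID that the results reference. The sketch below shows one way to do that with plain dicts. It is illustrative only: RULE_CATALOG, build_rules, build_results, and the "message" key on findings are assumptions (only the "rule_id" key appears in the excerpt above), and the catalog text is placeholder data.

```python
# Hypothetical catalog; the real rule names and descriptions live in SkillSpector.
RULE_CATALOG = {
    "PE3": {"name": "PlaceholderPE3", "description": "Placeholder text for PE3."},
    "SQP-2": {"name": "PlaceholderSQP2", "description": "Placeholder text for SQP-2."},
}


def build_rules(findings: list[dict]) -> list[dict]:
    """Return one reportingDescriptor dict per distinct rule_id, in first-seen order."""
    rules: list[dict] = []
    seen: set[str] = set()
    for finding in findings:
        rule_id = finding["rule_id"]
        if rule_id in seen:
            continue
        seen.add(rule_id)
        # Fall back to the bare ID so an uncatalogued rule still gets a definition.
        meta = RULE_CATALOG.get(rule_id, {})
        rules.append({
            "id": rule_id,
            "name": meta.get("name", rule_id),
            "shortDescription": {"text": meta.get("description", f"Rule {rule_id}")},
        })
    return rules


def build_results(findings: list[dict], rules: list[dict]) -> list[dict]:
    """Build results whose ruleId and ruleIndex both point into the rules list."""
    index_by_id = {rule["id"]: i for i, rule in enumerate(rules)}
    return [
        {
            "ruleId": f["rule_id"],
            "ruleIndex": index_by_id[f["rule_id"]],
            "message": {"text": f["message"]},
        }
        for f in findings
    ]


findings = [
    {"rule_id": "PE3", "message": "first finding"},
    {"rule_id": "MP2", "message": "second finding"},
    {"rule_id": "PE3", "message": "third finding"},
]
rules = build_rules(findings)
results = build_results(findings, rules)
print([rule["id"] for rule in rules])
print([result["ruleIndex"] for result in results])
```

Deriving the rules list from the findings guarantees that no ruleId is left without a definition, even for a rule that the catalog does not yet describe.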

Per SARIF v2.1.0 spec (section 3.19.23): "If result.ruleId is present, the tool component identified by result.rule.toolComponent SHALL contain a reportingDescriptor object whose id property matches."

Impact

  • GitHub Code Scanning: Findings appear without descriptions, severity, or help text — reducing them to opaque codes
  • Schema validation failure: Strict SARIF consumers reject the file outright
  • Reduced tool interoperability: VS Code SARIF Viewer, Azure DevOps, DefectDojo, and SIEM integrations cannot enrich findings with rule metadata
  • Compliance gap: Organizations requiring SARIF-standard outputs for audit trails get non-compliant files
  • Manual work: Security teams must manually map rule IDs to descriptions instead of relying on the standard's self-documenting format

Affected Version

SkillSpector v2.2.3
