feat(cli): add --recursive flag for multi-skill directory scanning#141
Conversation
|
PS: These issues were found while running bunch of skill evals using SkillSpector and hence raising all the issues so that i can use these with correct reference to Upstream code |
rng1995
left a comment
There was a problem hiding this comment.
Review: add --recursive flag for multi-skill directory scanning
Nice feature. The detection logic is appropriately bounded and the detect_skills unit tests are thorough. Approving with non-blocking nits.
Security / robustness (looks good)
- Detection only inspects immediate child directories (one level), skips hidden dirs, and uses
yaml.safe_load. The frontmatter regex\n---\s*\nis linear (no ReDoS), and there's no unbounded filesystem walk — so no path-explosion concern. - One thing to be aware of:
child.is_dir()follows symlinks, so a symlinked subdirectory would be scanned. Acceptable for a user-initiated local scan, but worth a comment or anis_symlink()guard if you want to be strict.
Non-blocking nits
--recursiveis a slight misnomer — it scans only immediate child directories, not a recursive tree. Consider clarifying the help text to avoid surprise.- Dead code: the
finally: passinside the per-skill loop can be removed. - Per-skill cleanup: the multi-skill path
returns before the outertry/finally, so per-skill results never pass through_cleanup_result. Local sub-skill dirs don't create temp dirs today, but if a sub-skill scan ever setstemp_dir_for_cleanupit would leak. Consider cleaning up each result. - Typing:
_scan_multi_skill(detection: object, ...)+assert isinstance(...)is awkward. Sincemulti_skillis already imported, importMultiSkillDetectionResultand type the parameter directly. - Summary count source: the summary uses
result.get("findings")(the raw accumulated list), while score/severity come fromfiltered_findings. This count can therefore differ from the per-skill report (and from #142's deduped list). Consider usingfiltered_findingsfor consistency. - Test gap:
detect_skillsis well covered, but_scan_multi_skillorchestration (themax_score > 50exit code, combined JSON output) has no tests. Worth adding.
None of these are correctness or security issues, hence the approval.
|
Thanks for the approval! Addressed the non-blocking nits:
Items 5 (symlink guard) and 6 (orchestration tests) noted for follow-up — keeping this PR focused on the current nits. |
When a directory contains multiple independent skills (each with its own SKILL.md) but no root-level SKILL.md, SkillSpector previously treated the entire tree as one monolithic skill—summing all findings into a single inflated score and reporting the skill name as "unknown". This commit adds: - Multi-skill directory detection (detect_skills) that identifies directories containing 2+ sub-skills by scanning for SKILL.md in immediate subdirectories - A --recursive (-r) CLI flag that invokes the graph once per detected sub-skill and produces per-skill reports with a combined summary table - Auto-detection warning when --recursive is not specified but multiple skills are detected, guiding users to the correct flag - JSON combined output format for programmatic consumption Skill names are extracted from SKILL.md frontmatter with a fallback to the directory name. Hidden directories are skipped. Directories with a root SKILL.md are always treated as single skills regardless of nested SKILL.md files. Fixes NVIDIA#136 Signed-off-by: Mohammed Imran Khan <mohammed_imran.khan@outlook.com>
… findings source - Use MultiSkillDetectionResult type directly instead of object + assert - Remove dead `finally: pass` block - Clarify --recursive help text (scans immediate subdirectories, not full tree) - Use filtered_findings (not raw findings) for summary counts and JSON output
07cd036 to
484af22
Compare
rng1995
left a comment
There was a problem hiding this comment.
Re-review: --recursive flag (refreshing approval on latest commit)
Re-confirming approval on the current head (484af228) — the follow-up commits addressed the actionable nits from my earlier review:
--recursivehelp text clarified to state it scans immediate subdirectories containingSKILL.md.- Dead
finally: passremoved. _scan_multi_skillnow takesMultiSkillDetectionResultdirectly (no moreobject+assert isinstance).- Summary/JSON counts now use
filtered_findings(falling back tofindings) for consistency with per-skill reports.
The two deferred items — a symlink is_symlink() guard on child directories, and orchestration tests for _scan_multi_skill's exit-code / combined-JSON paths — are fine as follow-ups; neither is a correctness or security concern. Thanks!
rng1995
left a comment
There was a problem hiding this comment.
Re-approval on the latest commit (484af228)
Re-confirming my approval on the current head — the author addressed four of my six non-blocking nits and the remaining items are reasonable follow-ups. Still approving.
Addressed:
- Help text clarified —
--recursivenow reads "Scan directories containing multiple skills (immediate subdirectories with SKILL.md) independently," which removes the "recursive tree" surprise. - Dead code removed — the
finally: passin the per-skill loop is gone. - Typing fixed —
_scan_multi_skill(detection: MultiSkillDetectionResult, ...)is now typed directly (imported at the top ofcli.py); no moreobject+assert isinstance. - Summary count source — the summary table now uses
result.get("filtered_findings") or result.get("findings"), consistent with the per-skill report (and with #142's deduped scoring).
Deferred (acceptable as follow-ups — none are correctness/security issues):
- Optional
is_symlink()guard on child directories — fine to skip for a user-initiated local scan. - Orchestration tests for
_scan_multi_skill(themax_score > 50exit code, combined JSON output) —detect_skillsremains well covered; adding these later is reasonable.
(For a future pass: the per-skill results in _scan_multi_skill still return before the outer try/finally, so they don't pass through _cleanup_result — harmless today since sub-skill scans don't create temp dirs, just noting it alongside the other follow-ups.)
Nice feature — approving on 484af228.
Summary
When a directory contains multiple independent skills (each with its own
SKILL.md) but no root-levelSKILL.md, SkillSpector previously treated the entire tree as one monolithic skill — summing all findings into a single inflated score and reporting the skill name as"unknown".Changes
multi_skill.py: New module withdetect_skills()that identifies directories containing 2+ sub-skills by scanning forSKILL.mdin immediate subdirectories. Extracts skill names from frontmatter with directory-name fallback.--recursive(-r) CLI flag: When set, invokes the LangGraph workflow once per detected sub-skill, producing per-skill reports and a combined summary table.--recursiveis NOT specified but multiple skills are detected, prints a warning guiding users to the correct flag.--recursive --format json --output, writes a structured combined report with per-skill scores.Example
Design Decisions
SKILL.mdalways wins.git,.venv, etc.)Test Plan
ruff checkandruff formatcleanFixes #136