Agent Skills - Discoverability by melo-gonzo · Pull Request #1688 · NVIDIA/physicsnemo

melo-gonzo · 2026-05-29T18:22:51Z

PhysicsNeMo Pull Request

Description

This PR sets up the canonical agent skills layout and adds a validated skill for 'discovery' of the physicsnemo codebase. The skill is useful for new users and for understanding how physicsnemo can help with a particular problem, and it guides users to the best places to start for their specific problem of interest. It does not write, modify, or distribute code; it exists purely for efficient information sharing and for surfacing details that may otherwise be hidden in docs, code snippets, or example folders.

NVIDIA validated signature, benchmark results, and evaluation prompts are included per process guidelines.

Note: the linters and pre-commit config were updated to exclude these peripheral files from formatting, since reformatting them would render the signatures invalid.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
The CHANGELOG.md is up to date with these changes.
An issue is linked to this pull request.
If I am implementing a new model or modifying any existing model, I have followed the Models Implementation Coding Standards.

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

greptile-apps · 2026-05-29T18:26:08Z

Greptile Summary

This PR establishes the canonical agent-skills directory layout for the PhysicsNeMo repo and introduces the physicsnemo-discover skill — a read-only, live-discovery-oriented agent skill that guides users toward the right model families, datapipes, and examples for SciML/AI4Science tasks without writing or modifying any code.

Structure: A new .agents/skills symlink points to the repo-root skills/ directory; markdownlint is excluded from skills/ in both .pre-commit-config.yaml and .markdownlintignore to preserve the cryptographic signature over the skill files.
Skill content: SKILL.md defines a "discover, don't remember" philosophy with a structured output template, an abstention path for out-of-scope queries, and companion TAXONOMY.md/RECIPES.md reference files; benchmark results (NVSkills-Eval, 4 tasks, PASS) and an NVIDIA-signed skill.oms.sig are included per process guidelines.
Notable benchmark signals: Both evaluation agents show negative effectiveness uplift (-9% claude-code, -5% codex), and the Tier 1 validator raised MEDIUM findings for missing ## Instructions/## Examples schema sections and an outbound git-clone instruction (SDI-2) that has no fallback when network access is unavailable.

Important Files Changed

| Filename | Overview |
| --- | --- |
| skills/physicsnemo-discover/SKILL.md | Core skill definition: well-structured with clear scope, output format, abstention template, and live-discovery philosophy; contains a git-clone instruction for headless environments that the benchmark validator flagged as a medium security concern with no fallback if the network call fails. |
| skills/physicsnemo-discover/BENCHMARK.md | Evaluation report showing overall PASS with 4 tasks; both agents show negative effectiveness uplift (-9% claude-code, -5% codex) and several unresolved MEDIUM findings (missing schema sections, long description, git-clone SDI-2). |
| skills/physicsnemo-discover/references/RECIPES.md | Concrete Glob/Grep/Read discovery patterns; well-organized with 11 recipe sections covering all major discovery axes, no issues found. |
| skills/physicsnemo-discover/references/TAXONOMY.md | Navigation scaffold with data-shape routing tables, domain maps, and stability tiers; explicitly warns against using it as a static inventory, no issues found. |
| skills/physicsnemo-discover/evals/evals.json | Four evaluation tasks (2 positive, 2 negative) with ground-truth and expected-behavior checks; covers the core abstention case and a clear-match case. |
| .pre-commit-config.yaml | Adds `exclude: ^skills/` to the markdownlint hook; consistent with the `.markdownlintignore` additions since git tracks skill files under `skills/` (the symlink target), not `.agents/skills/`. |
| .agents/skills | New symlink `.agents/skills → ../skills` following the canonical agent-skills layout convention; no issues. |
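The leading `^` in the `exclude: ^skills/` pattern is what limits the exclusion to the repo-root `skills/` directory. A minimal sketch of that matching behavior, assuming pre-commit treats `exclude` as a Python regular expression searched against each repo-relative path; the helper name `is_excluded` and the `docs/skills/notes.md` path are hypothetical and used only for illustration:

```python
import re

# Pattern taken from the markdownlint hook entry described in the file list above.
EXCLUDE = re.compile(r"^skills/")


def is_excluded(path: str) -> bool:
    """Return True if a repo-relative path would be skipped by the hook."""
    return EXCLUDE.search(path) is not None


paths = [
    "skills/physicsnemo-discover/SKILL.md",
    "skills/physicsnemo-discover/references/RECIPES.md",
    "docs/skills/notes.md",  # hypothetical path: still linted, the pattern is anchored
    "CHANGELOG.md",
]
for p in paths:
    print(f"{p}: {'skipped' if is_excluded(p) else 'linted'}")
```

Because git tracks the skill files under `skills/` rather than under the `.agents/skills` symlink, anchoring the pattern at the repo root is enough to keep the linter away from the signed files.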

Comments Outside Diff (3)

skills/physicsnemo-discover/BENCHMARK.md, line 107-108 (link)

Negative effectiveness uplift on both agents

The benchmark shows the skill reduces effectiveness compared to the no-skill baseline for both claude-code (-9%) and codex (-5%). This means agents following the skill complete the evaluated tasks at a lower rate than they would without it. The effect may be an artifact of a 4-task dataset (low statistical power), but it is the most visible signal suggesting the workflow overhead introduced by the skill's discovery constraints outweighs the guidance benefit in at least some cases. It would be worth clarifying whether this dimension is expected to be negative for a discovery-only skill, or whether the evaluation tasks include scenarios where the skill is mistakenly activated and penalized.
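To give a sense of how little a 4-task dataset can resolve, here is a rough sketch that computes a Wilson score interval for a pass rate. It assumes each task is scored as a simple pass or fail, which may not match how NVSkills-Eval computes effectiveness, and the 2-of-4 count is a hypothetical input, not a figure from BENCHMARK.md:

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


# Hypothetical example: 2 of 4 tasks passed.
low, high = wilson_interval(2, 4)
print(f"95% interval for 2/4: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval spans roughly 15% to 85%, so a single-digit percentage difference between the skill and no-skill runs sits well inside the noise.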

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

skills/physicsnemo-discover/SKILL.md, line 178 (link)

Shallow-clone instruction flagged by the benchmark validator

The benchmark's own Tier 1 static validation flagged this line as MEDIUM SECURITY/SDI-2 because it instructs agents to shallow-clone an external Git repository. The mitigations are present (read-only intent, URL is hardcoded, no execution of cloned code), but as written the instruction tells a generally-capable agent to run an outbound network call (git clone https://github.com/NVIDIA/physicsnemo). In sandboxed or network-restricted agent environments this will silently fail, leaving the agent without a repo to search — and the skill has no fallback behavior for that case. A note directing agents to skip the clone and ask the user for the repo path when the network call fails would improve robustness.
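The suggested fallback could look something like the sketch below. This is not the skill's actual behavior: the function names (`shallow_clone`, `locate_repo`), the `ask_user` callback, the `--depth 1` flag, and the timeout are all assumptions made for illustration. Only the repository URL comes from the comment above.

```python
import subprocess
from pathlib import Path
from typing import Callable

REPO_URL = "https://github.com/NVIDIA/physicsnemo"  # URL quoted in the review comment


def shallow_clone(url: str, dest: Path) -> None:
    """Run a shallow clone; raises if git is missing, fails, or times out."""
    subprocess.run(
        ["git", "clone", "--depth", "1", url, str(dest)],
        check=True,
        capture_output=True,
        timeout=120,
    )


def locate_repo(
    dest: Path,
    ask_user: Callable[[], Path],
    clone: Callable[[str, Path], None] = shallow_clone,
) -> Path:
    """Return a directory to search: the fresh clone, or a user-supplied path."""
    try:
        clone(REPO_URL, dest)
        return dest
    except (subprocess.SubprocessError, OSError):
        # Covers a non-zero exit, a timeout, and a missing git binary.
        # Instead of failing silently, hand control back to the user.
        return ask_user()
```

Passing the clone step in as a parameter keeps the fallback path testable without network access, which is the same condition the comment is worried about in sandboxed agent environments.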
skills/physicsnemo-discover/BENCHMARK.md, line 118-122 (link)

Missing recommended sections acknowledged but not resolved

The Tier 1 validator surfaced MEDIUM findings for the absent ## Instructions and ## Examples sections in SKILL.md. These sections are typically required for agents to understand how to invoke the skill correctly and to illustrate expected output. The benchmark passed the overall bar despite these gaps, but leaving them unaddressed risks agents misloading or misusing the skill in edge-case activation scenarios.

Reviews (1): Last reviewed commit: "docs: changelog"

NickGeneva

LGTM

NickGeneva · 2026-05-29T19:16:51Z

/ok to test 8a69fb5

NickGeneva · 2026-05-29T19:37:22Z

/blossom-ci

melo-gonzo added 4 commits May 29, 2026 12:17

fix: updating markdownlint and pre-commit to skip skills. modification will render signatures invalid.

dc65efd

feat: initial commit of signed skill for physicsnemo-discover

1bfb3b3

Merge branch 'main' of https://github.com/NVIDIA/physicsnemo into discoverability-skill

100b521

docs: changelog

8a69fb5

melo-gonzo self-assigned this May 29, 2026

melo-gonzo requested review from NickGeneva, coreyjadams and ktangsali as code owners May 29, 2026 18:22

NickGeneva reviewed May 29, 2026

Comment thread .pre-commit-config.yaml

NickGeneva approved these changes May 29, 2026

melo-gonzo enabled auto-merge May 29, 2026 19:57

melo-gonzo added this pull request to the merge queue May 29, 2026

Merged via the queue into NVIDIA:main with commit f103a41 May 29, 2026
6 checks passed

melo-gonzo merged 4 commits into NVIDIA:main from melo-gonzo:discoverability-skill