Skip to content

[Fix] Exclude numpy 2.3.5 from every IsaacLab install path#5656

Open
hujc7 wants to merge 3 commits into
isaac-sim:developfrom
hujc7:jichuanh/force-numpy-install
Open

[Fix] Exclude numpy 2.3.5 from every IsaacLab install path#5656
hujc7 wants to merge 3 commits into
isaac-sim:developfrom
hujc7:jichuanh/force-numpy-install

Conversation

@hujc7
Copy link
Copy Markdown
Collaborator

@hujc7 hujc7 commented May 16, 2026

1. Summary

Adds numpy!=2.3.5 at every install site that pulls numpy (10 touchpoints: 6 setup.py, 2 install.py pip argvs, 2 Dockerfile pip lines) so the broken release whose vendored OpenBLAS atfork handler crashes Kit's libomni.platforminfo fork() never lands in any pip resolve. Final image ends on numpy 2.3.4; supersedes #5642.

2. Sites covered

!=2.3.5 is the only delta added at each site — no >=2 constraints introduced where none existed.

File Site Why this site sees numpy
source/isaaclab/setup.py INSTALL_REQUIRES direct numpy dep
source/isaaclab_tasks/setup.py INSTALL_REQUIRES direct numpy dep
source/isaaclab_rl/setup.py INSTALL_REQUIRES direct numpy dep
source/isaaclab_visualizers/setup.py INSTALL_REQUIRES direct numpy dep
source/isaaclab_teleop/setup.py new INSTALL_REQUIRES entry transitive: dex-retargetingpin → cmeel-boost
source/isaaclab_mimic/setup.py new INSTALL_REQUIRES entry transitive: h5py
source/isaaclab/.../install.py _ensure_pink_ik_dependencies_installed pip argv force-reinstall of pin pin-pink daqp
source/isaaclab/.../install.py _maybe_preinstall_arm_nlopt pip argv ARM nlopt source-build prep
docker/Dockerfile.base ARM nlopt prep RUN pip install mirrors _maybe_preinstall_arm_nlopt
docker/Dockerfile.curobo ARM nlopt prep + nvidia-curobo install both can pull numpy under the cmeel cap

isaaclab_teleop's post---install reinstall in Dockerfile.curobo is covered by the teleop setup.py entry — no separate Dockerfile argv needed.

3. Design notes

3.1 Why declared at every site, not just source/isaaclab/setup.py

pip's resolver runs per-invocation. pip install -e <pkg> collects only <pkg>'s direct + transitive deps for that resolve — it does not re-enforce constraints declared by other already-installed packages. So numpy!=2.3.5 in isaaclab/setup.py is honored when pip install -e source/isaaclab is the target, but ignored during pip install -e source/isaaclab_teleop (which pulls cmeel-boost) and during the pin-pink force-reinstall. The !=2.3.5 constraint has to appear in every numpy-pulling resolve.

3.2 Why this lands on 2.3.4, not >=2.4

pin-pink pulls pin → cmeel-boost, whose Requires-Dist: numpy >=2.3, <2.4 is a hard upper bound. Combined with !=2.3.5, pip picks 2.3.4. The atfork bug is exclusive to 2.3.5's vendored OpenBLAS hash, so 2.3.4 is safe.

3.3 Path to numpy >= 2.4 (out of scope here — see §8)

Adding numpy>=2.4 in the same pip resolve conflicts with cmeel-boost's <2.4 cap → ResolutionImpossible. The only way past it is a separate pip install --upgrade numpy>=2.4.1 invocation that runs after cmeel-boost is already installed: pip prints one resolver warning about cmeel-boost's cap, then installs numpy 2.4.5 anyway (verified locally; numpy's stable C ABI since 2.0 keeps cmeel's compiled extensions working). That step is fragile in a refactor — see §8 for the design with safety nets if it's wanted.

4. Validation

Single-variable A/B between #5655 (negative arm, no fix) and #5684 (positive arm, fix cherry-picked)

Both branched from develop (a9b62101ca6) with an identical 1-file source/conftest.py diagnostic that prints the resolved numpy version + bundled OpenBLAS hash at pytest session start. The only difference between the two branches is the 10-site exclusion in this PR — git diff between them equals exactly this PR's patch.

dep-manifest in every CI job confirms which numpy bundle actually landed:

  • Negative arm jobs: numpy 2.3.5 + libscipy_openblas64_-fdde5778.so (the broken bundle).
  • Positive arm jobs: numpy 2.3.4 + libscipy_openblas64_-8fb3d286.so (the safe bundle).

Result: 4 jobs flipped FAIL → PASS — direct causal evidence of the atfork bug being resolved by this PR:

Job Neg-arm (numpy 2.3.5) Pos-arm (numpy 2.3.4)
isaaclab (core) [3/3] atfork SIGSEGV PASS
isaaclab_mimic atfork SIGSEGV PASS
isaaclab_teleop atfork SIGSEGV PASS
test-skillgen atfork SIGSEGV PASS
isaaclab_physx atfork SIGSEGV (libomni.platforminfo.plugin.so!std::mersenne_twister_engine, signal 11) passes the atfork canary; fails on an unrelated test_contact_sensor.py assertion (develop-side regression, likely from ContactSensor #5422)
isaaclab_newton atfork SIGSEGV passes the atfork canary; fails on ModuleNotFoundError: 'isaaclab_newton.test.mock_interfaces' (develop-side test packaging issue from #4722)

Negative-arm crash backtrace example (from isaaclab_physx):

[Fatal] [carb.crashreporter-breakpad.plugin] libomni.platforminfo.plugin.so!std::mersenne_twister_engine::operator()()
⚠️  test_newton_actuators_physx.py: Process killed by signal 11 (SIGSEGV — segmentation fault)

Positive-arm same job: dep-manifest shows numpy 2.3.4 + -8fb3d286, all tests run to completion, no SIGSEGV anywhere in the log.

Cross-arm fail-on-both jobs (isaaclab_tasks [1/3], [2/3], [3/3], test-curobo) fail for develop-side reasons independent of OpenBLAS — preset / rendering / curobo test regressions that need separate triage outside this PR's scope.

Earlier local + per-version evidence (kept for completeness)

  • Local (Python 3.12 + IsaacLab dep stack, env_isaaclab_test): numpy 2.4.5 smoke + 54/54 IsaacLab Pink IK unit tests pass (test_pink_ik_components.py 21/21, test_local_frame_task.py 24/24, test_null_space_posture_task.py 9/9). The 2.3.4 surface is structurally the same — only the bundled openblas hash differs.
  • Docker base build log on this PR's SHA: every numpy install/uninstall line confirmed; final state numpy 2.3.4; no Successfully installed ... numpy-2.3.5 anywhere.
  • This PR's CI: 37/37 green (single retry on an unrelated libcarb.tasking.plugin.so!make_fcontext flake).

5. Risk

  • 2.3.4 is point-in-time safe. If numpy 2.3.6 ships a fresh regression in the 2.3.x line, another exclusion would need to be added at every site.
  • A new IsaacLab submodule with a direct or transitive numpy dep must remember to include !=2.3.5. Same risk as any per-package constraint.
  • A future pip install line in a Dockerfile that pulls numpy needs the argv too.

6. Dependencies

Single-variable A/B that validates this PR

Two paired diagnostic PRs branching off the same develop SHA, where the only delta between them is the 10-site exclusion in this PR:

git diff between the two arms equals exactly this PR's 10-site exclusion patch — no other code, action.yml, or install overrides. The canary FAIL → PASS flip with only that delta is the causal proof.

Both diagnostic PRs are do-not-merge; close after evidence is captured.

7. Test plan

  • Pre-commit (./isaaclab.sh -f) clean
  • Changelog fragments added (one per touched package)
  • CI green on this branch (37/37 after one unrelated-flake retry)
  • Local Pink IK unit-test suite green against numpy 2.4.5 (1-numpy-minor upper bound from the runtime that 2.3.4 also satisfies)
  • Docker base build log audited — numpy 2.3.5 never appears

8. Out of scope / follow-ups

8.1 Upgrade to numpy >= 2.4

A defense-in-depth bump to numpy 2.4.5 (which ships the upstream OpenBLAS atfork fix and is forward-compatible with future 2.3.x regressions) requires a separate post-install upgrade step, because the cmeel-boost cap blocks >=2.4 in any single pip resolve. Recommended shape if/when it's wanted:

Component Purpose
_ensure_numpy_above_openblas_atfork_bug() in install.py, called from command_install after _ensure_pink_ik_dependencies_installed Named function with docstring — grep-able, lives next to other install helpers so refactors notice it
RUN ... pip install --upgrade "numpy>=2.4.1" at end of Dockerfile.curobo (after teleop + curobo installs) Mirror for the cuRobo image, which re-resolves after --install and would otherwise drop back
New source/isaaclab/test/install_ci/test_numpy_version.py asserting numpy.__version__ >= "2.4.1" Hard CI gate — if the upgrade step is silently removed in a future refactor, this test fails
Comment in every Dockerfile RUN pointing at the function Reviewer of any future Dockerfile pip step sees the constraint

The CI assertion is the key piece: without it, the post-install upgrade is "separate install that can easily get lost." With it, any regression is loud.

8.2 Upstream relaxation of cmeel-boost cap

Cleanest long-term fix: file an issue with cmake-wheel/cmeel-boost asking for Requires-Dist: numpy >=2.3 (drop the <2.4 upper bound). numpy 2.x ABI has been stable since 2.0, so the cap is over-conservative. Once relaxed upstream, IsaacLab can resolve to numpy >= 2.4 natively without the post-install dance.

References

Checklist

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label May 16, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Review Summary

PR #5656: [Fix] Force numpy>=2.4.1 in --install to escape OpenBLAS atfork SIGSEGV

✅ Overview

This PR addresses a critical SIGSEGV crash caused by numpy 2.3.5's vendored OpenBLAS (libscipy_openblas64_-fdde5778.so) during SimulationApp startup. The fix forces numpy >= 2.4.1 after the pin-pink installation step.


🔍 Key Findings

✅ Strengths

  1. Correct Root Cause Identification: The PR correctly identifies that pip install --upgrade --force-reinstall pin pin-pink==3.1.0 daqp==0.8.5 resolves numpy independently, landing on 2.3.5 due to cmeel-boost's numpy<2.4 cap.

  2. Strategic Placement: Adding the numpy upgrade after _ensure_pink_ik_dependencies_installed ensures it's the final pip operation, preventing subsequent steps from downgrading numpy.

  3. Graceful Error Handling: Using check=False with a clear warning allows the installation to complete even if the numpy upgrade fails, while informing users of the risk.

  4. Good Documentation: The inline comment references upstream issues (numpy/numpy#30092, OpenMathLib/OpenBLAS#5520) and internal tracking (OMPE-92261).

  5. Comprehensive Testing: 54/54 Pink IK unit tests verified locally against numpy 2.4.5.

⚠️ Minor Observations

  1. Early Return Addition (line 226): The return statement added after the pin-pink install failure is correct but represents a subtle behavior change - previously the function would continue (doing nothing useful) whereas now it explicitly returns. This is actually an improvement.

  2. Dependency Warning: Users may see a pip resolver warning about cmeel-boost's numpy<2.4 constraint. This is documented in the PR description as cosmetic and expected.


📋 Files Changed

File Changes Assessment
install.py +18 lines ✅ Clean implementation
jichuanh-force-numpy-install.rst +13 lines ✅ Proper changelog format

🧪 CI Status

Pre-commit and wheel build passing. Installation tests pending.


📌 Verdict

LGTM - This is a well-reasoned fix for a critical crash. The implementation follows existing patterns, handles errors gracefully, and is well-documented. The approach of forcing numpy >= 2.4.1 as the final pip step is appropriate given the constraint resolution complexities.


🤖 Review generated by isaaclab-review-bot


Update (1499d08): Reviewed new changes since beab776:

📁 New/Updated Files

File Status Assessment
docker/Dockerfile.curobo Added numpy upgrade ✅ Applies same fix to cuRobo Docker image
changelog.d/jichuanh-force-numpy-install.rst Added ✅ Comprehensive changelog entry
install.py Enhanced ✅ Better documentation with upstream issue refs

🔍 Changes Analysis

  1. Docker cuRobo Fix: Adds RUN ${ISAACLAB_PATH}/isaaclab.sh -p -m pip install --upgrade "numpy>=2.4.1" after curobo and isaaclab_teleop installs. This is necessary because nvidia-curobo and isaaclab_teleop pull in dex-retargetingpin → cmeel-boost, which would otherwise drag numpy back to 2.3.5.

  2. Changelog Entry: Well-written documentation covering both the base --install fix and the Dockerfile.curobo fix. Explains the root cause clearly.

  3. install.py Enhancement: The _ensure_numpy_above_openblas_atfork_bug function now has comprehensive docstring explaining the technical details and rationale.

✅ Verdict

Changes look good. The cuRobo Docker fix is a necessary addition to ensure the numpy upgrade is applied consistently across all installation paths. LGTM.

🤖 Update by isaaclab-review-bot


Update (6c1ad76): Reviewed new changes since 1499d08:

📁 New/Updated Files

File Status Assessment
source/isaaclab/setup.py Updated ✅ Added numpy>=2,!=2.3.5 constraint
source/isaaclab_rl/setup.py Updated ✅ Added numpy>=2,!=2.3.5 constraint
source/isaaclab_tasks/setup.py Updated ✅ Added numpy>=2,!=2.3.5 constraint
source/isaaclab_visualizers/setup.py Updated ✅ Added numpy>=2,!=2.3.5 constraint
isaaclab_rl/changelog.d/... Added ✅ Changelog entry
isaaclab_tasks/changelog.d/... Added ✅ Changelog entry
isaaclab_visualizers/changelog.d/... Added ✅ Changelog entry

🔍 Changes Analysis

  1. Defense in Depth: The !=2.3.5 exclusion in all setup.py files adds another layer of protection by explicitly blocking the broken numpy version from pip's intermediate resolutions, not just forcing an upgrade at the end.

  2. Consistent Constraint Pattern: All subpackages now use consistent numpy>=2,!=2.3.5 specification with comments pointing to the main explanation in source/isaaclab/setup.py.

  3. Comprehensive Changelogs: Each affected package has its own changelog entry documenting the fix.

✅ Verdict

Excellent additions. This defense-in-depth approach is a good improvement over relying solely on the post-install upgrade. By excluding numpy 2.3.5 from the package requirements, pip will actively avoid it during resolution rather than potentially installing it and then upgrading. This makes the fix more robust across different installation scenarios. LGTM.

🤖 Update by isaaclab-review-bot


Update (eae5a01): Reviewed changes since 6c1ad76:

🔄 Strategy Change: From Forced Upgrade to Exclusion

The approach has been simplified from forcing numpy>=2.4.1 upgrade to exclusively using numpy!=2.3.5 exclusions across all install paths.

📁 Changes Summary

File Change Assessment
docker/Dockerfile.base ARM pre-install: numpynumpy!=2.3.5 ✅ Consistent exclusion
docker/Dockerfile.curobo Removed post-install upgrade, added numpy!=2.3.5 to curobo install ✅ Cleaner approach
source/isaaclab/install.py Passes numpy!=2.3.5 in pip commands instead of forcing upgrade ✅ Less aggressive
source/isaaclab_mimic/setup.py Added numpy!=2.3.5 ✅ New coverage
source/isaaclab_teleop/setup.py Added numpy!=2.3.5 ✅ New coverage
source/isaaclab/changelog.d/... Updated with comprehensive list of exclusion sites ✅ Thorough docs

🔍 Analysis

This is a cleaner approach compared to the previous force-upgrade strategy:

  1. Less Intrusive: Exclusion (!=2.3.5) lets pip choose any valid numpy version rather than forcing a specific minimum (>=2.4.1)
  2. Better Compatibility: Won't conflict with other constraints that might cap numpy below 2.4
  3. Consistent Pattern: All sites now use the same numpy!=2.3.5 pattern
  4. Comprehensive Coverage: Added exclusions to previously-missed packages (isaaclab_mimic, isaaclab_teleop)

✅ Verdict

LGTM - The simplified exclusion-only approach is more robust and less likely to cause resolver conflicts. Good decision to pivot from the upgrade-forcing strategy. All install paths are now covered with the !=2.3.5 exclusion.

🤖 Update by isaaclab-review-bot

@hujc7 hujc7 force-pushed the jichuanh/force-numpy-install branch from beab776 to 1499d08 Compare May 16, 2026 21:35
@hujc7 hujc7 force-pushed the jichuanh/force-numpy-install branch from 1499d08 to 6c1ad76 Compare May 16, 2026 21:44
numpy 2.3.5 ships a vendored OpenBLAS
(libscipy_openblas64_-fdde5778.so) whose pthread_atfork handler crashes
Kit's libomni.platforminfo fork() during SimulationApp startup. The
release is excluded at every site that pulls numpy directly or
transitively, so no pip resolve during isaaclab.sh --install or any
Docker image build can land on it -- even transiently:

  source/isaaclab/setup.py
  source/isaaclab_tasks/setup.py
  source/isaaclab_rl/setup.py
  source/isaaclab_visualizers/setup.py
  source/isaaclab_teleop/setup.py        (transitive via dex-retargeting)
  source/isaaclab_mimic/setup.py         (transitive via h5py)
  isaaclab.cli.commands.install._ensure_pink_ik_dependencies_installed
  isaaclab.cli.commands.install._maybe_preinstall_arm_nlopt
  docker/Dockerfile.base                 (ARM nlopt prep)
  docker/Dockerfile.curobo               (ARM nlopt prep + nvidia-curobo install)

Each touchpoint adds only the ``!=2.3.5`` exclusion; no other version
constraints are introduced.

Validated:
- env_isaaclab_test smoke test (numpy 2.4.5 + cmeel pinocchio + pink + daqp
  + qpsolvers all import; toy IK solve OK).
- IsaacLab Pink IK unit tests: 54/54 pass against numpy 2.4.5.
- PR isaac-sim#5655 worst-case run (diagnostic imports numpy before pytest spawns
  Kit, the order that originally crashed): 36 pass / 0 fail. The
  isaaclab_physx surface gripper SIGSEGV is gone.

Related: numpy/numpy#30092, OpenMathLib/OpenBLAS#5520
@hujc7 hujc7 force-pushed the jichuanh/force-numpy-install branch from 6c1ad76 to eae5a01 Compare May 18, 2026 03:47
@github-actions github-actions Bot added the isaac-mimic Related to Isaac Mimic team label May 18, 2026
@hujc7 hujc7 changed the title [Fix] Force numpy>=2.4.1 in --install to escape OpenBLAS atfork SIGSEGV [Fix] Exclude numpy 2.3.5 from every IsaacLab install path May 18, 2026
@hujc7 hujc7 marked this pull request as ready for review May 18, 2026 23:40
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR adds numpy!=2.3.5 at every install site that pulls numpy in IsaacLab, preventing the broken 2.3.5 release (whose vendored OpenBLAS registers a faulty pthread_atfork handler that crashes Kit's fork() during SimulationApp startup) from being resolved at any step of isaaclab.sh --install or Docker image builds.

  • 10 install sites covered: six setup.py files, two locations in install.py (_maybe_preinstall_arm_nlopt and _ensure_pink_ik_dependencies_installed), and two Dockerfile steps (Dockerfile.base ARM nlopt prep, Dockerfile.curobo nvidia-curobo install).
  • Point-in-time exclusion: the !=2.3.5 constraint is a precise exclusion; any future numpy release with the same OpenBLAS regression would require a new entry at each of these sites. The PR explicitly acknowledges this risk.

Confidence Score: 5/5

Safe to merge. All ten install sites are consistently updated with the !=2.3.5 exclusion, and the final resolved numpy version (2.3.4) is validated by CI and Pink IK unit tests.

The change is narrow and surgical — it adds a single version exclusion string at every install site that can pull numpy. The pre-existing state of some setup.py files (no >=2 lower bound on isaaclab_rl, isaaclab_visualizers, isaaclab_mimic, isaaclab_teleop) is unchanged and not introduced by this PR; those packages are always co-installed with isaaclab which enforces numpy>=2. The force-reinstall path in _ensure_pink_ik_dependencies_installed is correct — the combined constraint from pin-pink / cmeel-boost (numpy<2.4) and the new !=2.3.5 exclusion pins resolution to 2.3.4. All CI checks are green and 54/54 Pink IK tests pass.

No files require special attention. All ten changed install sites apply the exclusion consistently.

Important Files Changed

Filename Overview
source/isaaclab/setup.py Adds !=2.3.5 to the existing numpy>=2 constraint, keeping the lower-bound guard intact. Includes a detailed comment explaining the rationale.
source/isaaclab/isaaclab/cli/commands/install.py Two pip invocations updated: ARM nlopt pre-install and Pink IK force-reinstall. Both now pass numpy!=2.3.5 so isolated resolves cannot land on the broken release. Comments added explaining the rationale at both sites.
docker/Dockerfile.base ARM64 numpy pre-install step now uses numpy!=2.3.5 instead of bare numpy, correctly mirroring the install.py change.
docker/Dockerfile.curobo ARM nlopt step updated (same as base) and numpy!=2.3.5 added to the nvidia-curobo install invocation to guard against transitive numpy resolution through cuRobo's deps.
source/isaaclab_mimic/setup.py Adds numpy!=2.3.5 as a new explicit entry (h5py transitive path). No lower bound added, consistent with the PR's stated policy of not changing existing constraints.
source/isaaclab_teleop/setup.py Adds numpy!=2.3.5 as a new explicit entry guarding the dex-retargeting -> pin -> cmeel-boost transitive path.
source/isaaclab_rl/setup.py Upgrades the bare numpy dep to numpy!=2.3.5. No lower bound added (pre-existing state), consistent with the PR's stated scope.
source/isaaclab_tasks/setup.py Extends numpy>=2 to numpy>=2,!=2.3.5, preserving the lower bound. Clean change.
source/isaaclab_visualizers/setup.py Upgrades bare numpy dep to numpy!=2.3.5. The isaaclab dependency in this file already pins numpy>=2,!=2.3.5, so the combined resolution is effectively numpy>=2,!=2.3.5.

Reviews (1): Last reviewed commit: "[Fix] Exclude numpy 2.3.5 from every Isa..." | Re-trigger Greptile

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Review Summary

PR #5656: [Fix] Exclude numpy 2.3.5 from every IsaacLab install path
Commit: eb676ca

✅ Overview

This PR addresses a critical SIGSEGV crash caused by numpy 2.3.5's vendored OpenBLAS (libscipy_openblas64_-fdde5778.so) registering a buggy pthread_atfork handler that crashes Kit's fork() during SimulationApp startup. The fix adds numpy!=2.3.5 exclusions at every install site that pulls numpy.


🔍 Key Findings

✅ Strengths

  1. Comprehensive Coverage: All 10 install paths are addressed:

    • 6 setup.py files (isaaclab, isaaclab_tasks, isaaclab_rl, isaaclab_visualizers, isaaclab_teleop, isaaclab_mimic)
    • 2 install.py locations (_maybe_preinstall_arm_nlopt, _ensure_pink_ik_dependencies_installed)
    • 2 Dockerfile sites (Dockerfile.base ARM prep, Dockerfile.curobo ARM + cuRobo install)
  2. Correct Design Rationale: The PR correctly identifies that pip's resolver runs per-invocation, meaning constraints from already-installed packages are not enforced in subsequent resolves. The exclusion must appear at every numpy-pulling site.

  3. Minimal Invasiveness: Using !=2.3.5 exclusion rather than forcing >=2.4 avoids conflicts with cmeel-boost's numpy<2.4 cap, letting pip land on 2.3.4 naturally.

  4. Thorough Documentation:

    • Comprehensive PR description explaining the root cause and design decisions
    • Inline comments at each modification site referencing the main rationale
    • Proper changelog entries for each affected package
  5. Validated Approach: 54/54 Pink IK tests pass locally, Docker build logs audited to confirm no 2.3.5 installation, CI green.

⚠️ Acknowledged Maintenance Considerations

The PR honestly documents these ongoing maintenance requirements:

  • New IsaacLab submodules with numpy deps must include the exclusion
  • New Dockerfile pip lines pulling numpy need the constraint
  • If numpy 2.3.6+ introduces a new regression in the 2.3.x line, additional exclusions would be needed

These are acceptable trade-offs for a targeted, surgical fix.


📋 Files Review

Component Files Assessment
Core isaaclab setup.py, install.py ✅ Correct constraint placement with detailed comments
Sub-packages 5 additional setup.py ✅ Consistent pattern, appropriate cross-references
Docker Dockerfile.base, Dockerfile.curobo ✅ Mirrors Python install paths correctly
Changelogs 6 .rst files ✅ Proper format, clear explanations

🧪 Test Coverage Assessment

Existing validation:

  • Local Pink IK unit tests: 54/54 pass against numpy 2.4.5 (demonstrates ABI compatibility)
  • Docker build audit: numpy 2.3.5 never installed, final state is 2.3.4
  • PR #5655 diagnostic run: confirms the atfork crash is resolved
  • CI: pre-commit, wheel build passing

Risk mitigation: The crash is immediately visible (SIGSEGV at startup), so any regression would be caught quickly rather than silently degrading functionality.


📌 Verdict

LGTM — This is a well-executed fix for a critical crash. The exclusion-based approach is appropriately surgical, the coverage is comprehensive across all install paths, and the documentation is thorough. The maintenance burden acknowledged by the PR is an acceptable trade-off for avoiding conflicts with upstream dependency caps.


🤖 Review generated by isaaclab-review-bot

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Review Summary

PR #5656: [Fix] Exclude numpy 2.3.5 from every IsaacLab install path
Commit: b04f0da

✅ Overview

This PR addresses a critical SIGSEGV crash caused by numpy 2.3.5's vendored OpenBLAS (libscipy_openblas64_-fdde5778.so) registering a buggy pthread_atfork handler that crashes Kit's fork() during SimulationApp startup. The fix adds numpy!=2.3.5 exclusions at every install site that pulls numpy.


🔍 Key Findings

✅ Strengths

  1. Comprehensive Coverage: All 10 install paths are addressed:

    • 6 setup.py files (isaaclab, isaaclab_tasks, isaaclab_rl, isaaclab_visualizers, isaaclab_teleop, isaaclab_mimic)
    • 2 install.py locations (_maybe_preinstall_arm_nlopt, _ensure_pink_ik_dependencies_installed)
    • 2 Dockerfile sites (Dockerfile.base ARM prep, Dockerfile.curobo ARM + cuRobo install)
  2. Correct Design Rationale: The PR correctly identifies that pip's resolver runs per-invocation, meaning constraints from already-installed packages are not enforced in subsequent resolves. The exclusion must appear at every numpy-pulling site.

  3. Minimal Invasiveness: Using !=2.3.5 exclusion rather than forcing >=2.4 avoids conflicts with cmeel-boost's numpy<2.4 cap, letting pip land on 2.3.4 naturally.

  4. Thorough Documentation:

    • Comprehensive PR description explaining the root cause and design decisions
    • Inline comments at each modification site referencing the main rationale
    • Proper changelog entries for each affected package
  5. Validated Approach: 54/54 Pink IK tests pass locally, Docker build logs audited to confirm no 2.3.5 installation, CI green.

⚠️ Observations

  1. Maintenance Considerations (minor): The PR honestly documents ongoing maintenance requirements—new submodules must include the exclusion. This is an acceptable trade-off for a targeted fix.

  2. No Automated Version Guard (nit): There's no automated CI test asserting numpy != 2.3.5 at runtime. The PR's §8.1 proposes such a test for a future numpy>=2.4 upgrade but defers it as out of scope. Given that the crash is immediately visible (SIGSEGV), this is acceptable for now.

  3. Transitive Dependency Paths Well-Mapped (positive): The PR correctly traces transitive paths—isaaclab_teleop via dex-retargeting→pin→cmeel-boost, isaaclab_mimic via h5py—and adds the exclusion at each terminal.


📋 Files Review

Component Files Assessment
Core isaaclab setup.py, install.py ✅ Correct constraint placement with detailed comments
Sub-packages 5 additional setup.py ✅ Consistent pattern, appropriate cross-references
Docker Dockerfile.base, Dockerfile.curobo ✅ Mirrors Python install paths correctly
Changelogs 6 .rst files ✅ Proper format, clear explanations

🧪 Test Coverage Assessment

Existing validation:

  • Local Pink IK unit tests: 54/54 pass against numpy 2.4.5 (demonstrates ABI compatibility)
  • Docker build audit: numpy 2.3.5 never installed, final state is 2.3.4
  • PR #5655 diagnostic run: confirms the atfork crash is resolved
  • CI: pre-commit, wheel build, changelog checks passing

Risk mitigation: The crash is immediately visible (SIGSEGV at startup), so any regression would be caught quickly rather than silently degrading functionality.


📌 Verdict

LGTM — This is a well-executed fix for a critical crash. The exclusion-based approach is appropriately surgical, the coverage is comprehensive across all install paths, and the documentation is thorough. The maintenance burden acknowledged by the PR is an acceptable trade-off for avoiding conflicts with upstream dependency caps.


🤖 Review generated by isaaclab-review-bot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

infrastructure isaac-lab Related to Isaac Lab team isaac-mimic Related to Isaac Mimic team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant