Skip to content

Carb crash#5659

Closed
pbarejko wants to merge 7 commits into
isaac-sim:developfrom
pbarejko:pbarejko/crash
Closed

Carb crash#5659
pbarejko wants to merge 7 commits into
isaac-sim:developfrom
pbarejko:pbarejko/crash

Conversation

@pbarejko
Copy link
Copy Markdown
Collaborator

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context.
List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (existing functionality will not work without user modification)
  • Documentation update

Screenshots

Please attach before and after screenshots of the change if applicable.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label May 17, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 17, 2026

Greptile Summary

This PR adds faulthandler instrumentation at the top of test_newton_schemas.py, apparently to help diagnose a hanging or crashing test run.

  • faulthandler.enable is a reasonable safety net for catching segfaults, but faulthandler.dump_traceback_later with repeat=True and a 1-second interval will print full thread stack traces to stderr every second for the entire test run, and faulthandler.cancel_dump_traceback_later() is never called to stop the timer.
  • No other changes were made; the test logic itself is unchanged. This appears to be temporary debugging scaffolding that was not meant to be merged.

Confidence Score: 3/5

Not safe to merge — contains leftover debug instrumentation that will continuously dump thread stack traces to stderr throughout every test run.

The only change is module-level faulthandler code that fires a repeating 1-second timer dumping all thread tracebacks to stderr. The timer is never cancelled, so it runs for the entire test session, making CI logs unreadable. This is clearly diagnostic scaffolding from a debugging session that was not cleaned up before opening the PR.

source/isaaclab_newton/test/sim/test_newton_schemas.py — the faulthandler block at lines 8–17 needs to be removed or, at minimum, the dump_traceback_later call needs to be dropped.

Important Files Changed

Filename Overview
source/isaaclab_newton/test/sim/test_newton_schemas.py Adds module-level faulthandler debugging code: faulthandler.enable (reasonable) plus dump_traceback_later with a 1-second repeating timer that is never cancelled, flooding stderr throughout the test run.

Reviews (1): Last reviewed commit: "Test" | Re-trigger Greptile

Comment on lines +12 to +17
faulthandler.dump_traceback_later(
1.0,
repeat=True,
file=sys.__stderr__,
exit=False,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unintentional debug code — floods stderr every second

faulthandler.dump_traceback_later with repeat=True and a 1-second timeout will dump full thread stack traces to stderr every second for the entire test run. Because faulthandler.cancel_dump_traceback_later() is never called — not even in the setup_sim teardown — the timer runs until the process exits, producing a continuous wall of noise in CI logs. This looks like diagnostic scaffolding added while investigating a hang, and should be removed before merging.

@pbarejko pbarejko marked this pull request as draft May 17, 2026 03:28
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Automated Code Review

Summary

This PR appears to be a debugging/testing PR that makes significant changes to the CI pipeline for troubleshooting test_newton_schemas.py. These changes should not be merged to develop in their current state.


🚨 Critical Issues

1. CI Pipeline Disabled

Severity: CRITICAL | Files: .github/workflows/build.yaml

22 CI jobs are disabled by setting if: false:

  • All test jobs (isaaclab-core, isaaclab-assets, isaaclab-devices, etc.)
  • Docker build jobs (build-curobo-docker-image)
  • cuRobo test jobs
  • Training and benchmark jobs

Impact: Merging this would disable the entire CI test suite on the develop branch, allowing untested code to be merged.

Suggestion: These changes should remain on a feature branch for debugging only, or be reverted before merging.


2. Debug Code in Test File ✅ RESOLVED

Severity: HIGH RESOLVED | Files: source/isaaclab_newton/test/sim/test_newton_schemas.py

The faulthandler debugging instrumentation was present.

Update (fb867a0): Faulthandler code has been commented out. This issue is now resolved.


⚠️ Moderate Issues

3. Test Scope Narrowed

Severity: MEDIUM | Files: .github/workflows/build.yaml (line 592)

The Newton test job now includes include-files: "test_newton_schemas.py", which limits test execution to only this file. This significantly reduces test coverage for the isaaclab_newton package.


📋 Process Issues

Issue Status
PR Description ❌ Template not filled out
Linked Issue ❌ No issue referenced
Contribution Checklist ❌ All items unchecked

💡 Recommendations

  1. Do not merge this PR to develop - The CI disabling changes would break the repository's testing infrastructure
  2. If debugging is needed, consider:
    • Creating a draft PR
    • Adding [WIP] or [DO NOT MERGE] to the title
    • Using a separate branch for CI experiments
  3. Once debugging is complete, create a new PR with only the actual fix

📊 Progress Tracker

Issue Status
CI pipeline disabled (22 jobs) ❌ Still present
Debug faulthandler code ✅ Commented out (fb867a0)
Test scope narrowed ❌ Still present
Telemetry disabled ⚠️ Still present (no justification)
Extra pytest verbosity ⚠️ Still present

This is an automated review. 1 of 3 critical/high issues has been addressed. Please address remaining issues before merging.

@github-actions github-actions Bot added the isaac-sim Related to Isaac Sim team label May 17, 2026
@pbarejko pbarejko changed the title Test Forcing carb crash May 18, 2026
@pbarejko pbarejko changed the title Forcing carb crash Carb crash May 18, 2026
pbarejko added 2 commits May 18, 2026 13:49
This reverts commit 70b2a25.
This reverts commit d537628.
@isaaclab-review-bot
Copy link
Copy Markdown


Update (9e89c4f): Commits 717f0a6e and 9e89c4fa revert the telemetry disabling and verbose pytest flag changes. This is good progress! However, the core issues remain:

Current Status

Issue Status
22 CI jobs disabled ❌ Still present
Debug faulthandler code ❌ Still present
Test scope narrowed (include-files) ❌ Still present
Telemetry disabled Reverted
Verbose pytest flag Reverted

Remaining Concerns

  1. CI Pipeline: All 22 jobs are still disabled with if: false. This PR should not be merged to develop in this state.

  2. Faulthandler Instrumentation: The periodic traceback dumping (every 1 second) is still present in test_newton_schemas.py. While useful for debugging, this should be removed before merge.

  3. Narrowed Test Scope: The include-files: "test_newton_schemas.py" limits the Newton tests. Other Newton tests are not running.

Recommendation

This appears to be an active debugging PR. Consider:

  • Marking as Draft PR until the investigation completes
  • Adding [WIP] or [DO NOT MERGE] to the title
  • Once the root cause is found, create a separate PR with the actual fix

Core issues from previous review remain unaddressed. PR not ready for merge.

@pbarejko pbarejko closed this May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

infrastructure isaac-lab Related to Isaac Lab team isaac-sim Related to Isaac Sim team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant