Fix intermittent DALI failure #1687
Greptile Summary
This PR fixes an intermittent SIGABRT in DALI-backed datapipe tests by building the datapipe iterator once (
Reviews (1): Last reviewed commit: "Merge branch 'main' into fix-intermitten..."
/blossom-ci
peterdsharpe
left a comment
@ktangsali are you able to consistently reproduce this, or is it intermittent? If it's consistently reproducible, can we add a non-regression test as part of the test suite here?
Yes, I did test it for intermittency. Before the fix, `pytest test/datapipes` failed ~5/16 times when I ran it locally. After the fix, I got a 16/16 pass rate. What do you mean by a non-regression test here?
PhysicsNeMo Pull Request
Description
`check_cuda_graphs` called `next(iter(datapipe))` inside its warmup/record/replay loops, which re-entered `Datapipe.__iter__` every step and reset the underlying DALI pipeline 8 times per test, exposing a race in DALI's multiprocessing-pool `_observer_thread` that intermittently aborted the interpreter.

Fix: build the iterator once outside the loops (`data_iter = iter(datapipe)`) and advance it with `next(data_iter)`, collapsing 8 resets into 1 and closing the race window.

Test: earlier, `pytest test/datapipes` would fail ~5/16 times when I tested locally. Post fix, was able to get 16/16 pass rate.

Checklist
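For reference, the iterator change in the description can be sketched without DALI. This is a minimal illustration, not the actual `check_cuda_graphs` code: `CountingDatapipe`, `fetch_per_step`, and `fetch_shared_iterator` are hypothetical names, and the assumption that `__iter__` is where the pipeline reset happens is taken from the description above.

```python
class CountingDatapipe:
    """Hypothetical stand-in for a datapipe; not the PhysicsNeMo class."""

    def __init__(self, num_batches=16):
        self.num_batches = num_batches
        self.resets = 0  # number of times __iter__ has been entered

    def __iter__(self):
        # Assumption: a DALI-backed datapipe would reset its pipeline here.
        self.resets += 1
        return iter(range(self.num_batches))


def fetch_per_step(datapipe, steps):
    """Old pattern: a fresh iterator, and therefore a reset, on every step."""
    return [next(iter(datapipe)) for _ in range(steps)]


def fetch_shared_iterator(datapipe, steps):
    """Fixed pattern: build the iterator once, then advance it."""
    data_iter = iter(datapipe)
    return [next(data_iter) for _ in range(steps)]


old = CountingDatapipe()
old_batches = fetch_per_step(old, steps=8)

new = CountingDatapipe()
new_batches = fetch_shared_iterator(new, steps=8)

print(old.resets, new.resets)  # 8 1
```

In this toy version the old pattern also returns the first batch on every step, while the shared iterator advances through the data. Whether the real datapipe behaves the same way depends on its `__iter__` implementation.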
Dependencies
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.