Drain in-flight graphics tasks before NativeEngine teardown#1746
Open
bkaradzic-microsoft wants to merge 2 commits into
Open
Drain in-flight graphics tasks before NativeEngine teardown#1746bkaradzic-microsoft wants to merge 2 commits into
bkaradzic-microsoft wants to merge 2 commits into
Conversation
NativeEngine::Dispose only cancelled the cancellation source, which short-circuits task scheduling but does not stop a threadpool task already executing its body. On teardown an in-flight loadTexture, loadProgram, or loadCubeTexture body could touch a texture or bgfx context that had already been freed, causing an access violation. Add an AsyncTaskTracker whose RAII Scope token is captured into the four graphics-touching threadpool lambdas. Dispose now cancels and then waits for those tokens to drain before returning, so teardown can't free resources out from under a running task. The token decrements on the threadpool thread, so Dispose (JS thread) never deadlocks on its own continuations. Also add a cancellation early-out to the load bodies to skip work once teardown has begun. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens NativeEngine shutdown by ensuring any in-flight threadpool work that touches bgfx/graphics resources has completed before teardown proceeds, preventing use-after-free crashes during app/test shutdown.
Changes:
- Added an
AsyncTaskTracker(counter +condition_variable) andTrackAsyncTask()RAII token to track graphics-touching async work. - Updated
NativeEngine::Dispose()to cancel work and then block until tracked tasks drain. - Captured tracking tokens into
loadProgram,loadTexture, and cube-texture inline continuations; added cancellation early-outs for the two threadpool bodies.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| Plugins/NativeEngine/Source/NativeEngine.h | Introduces AsyncTaskTracker + TrackAsyncTask() and stores the tracker on NativeEngine. |
| Plugins/NativeEngine/Source/NativeEngine.cpp | Implements tracker logic, waits during Dispose(), and wires tracking/cancellation checks into async graphics load paths. |
Adds a deterministic UnitTests regression test (Tests.NativeEngine.Teardown.cpp) for the loadTexture async-teardown race. NativeEngine gains a test-only, BABYLON_NATIVE_BUILD_APPS-gated JS method (_disposeDrainTestSchedule) that schedules a tracked threadpool task -- the same TrackAsyncTask mechanism the async texture/shader loaders use -- which signals start, sleeps, then signals finish without touching any graphics resources. The test lets that task get in flight, disposes the engine, and asserts the task finished by the time Dispose returned. With the drain in place Dispose blocks until the task completes (pass); without it Dispose returns early (fail). Because the task touches nothing, the test is deterministic and never relies on undefined behaviour. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
NativeEngine::Disposeonly calledm_cancellationSource->cancel(). arcana cancellation short-circuits work at scheduling boundaries, but a task body that is already running on a threadpool thread keeps going to completion.The graphics-loading entry points (
loadProgram,loadTexture,loadCubeTexture) schedule their work as threadpool tasks that capture a rawGraphics::Texture*/ call into bgfx. On app teardown, if the suite finishes while one of these bodies is mid-flight,Disposecancels and teardown proceeds to free the texture / tear down the bgfx context — and the still-running task then touches freed resources, causing an access violation:This reproduces at shutdown, after the test run reports it finished.
Fix
Add a small
AsyncTaskTracker(mutex + condition_variable + counter) whose RAIIScopetoken is captured into the four graphics-touching threadpool lambdas (theloadProgrambody, theloadTexturebody, and the twoLoadCubeTextureFromImagesinline continuations). The CPU-only image-parse bodies are intentionally not tracked.Disposenow cancels and then waits for those tokens to drain before returning, so teardown can't free graphics resources out from under a running task. The token decrements on the threadpool thread (never on a JS-thread.thencontinuation), soDispose— which runs on the JS thread — can never deadlock waiting on its own continuations. Counting is bound to the token's lifetime, so the increment/decrement pair can't be split by an exception.A defensive
if (cancellationSource->cancelled()) return;early-out is also added at the top of the two threadpool load bodies, so once teardown has begun they skip touching graphics resources entirely.Testing
NativeEngineandPlaygroundbuild clean (RelWithDebInfo, x64, D3D11).Spritestest (loadTexturepath) and a cubemap test (LoadCubeTextureFromImagespath) headless — both exercise the load paths and tear down promptly with no hang and no crash.