[paddle-adapt] gemm: adapt tests/gemm/test_group_gemm.py, test_mm_bf16.py, test_bmm_bf16.py #17
Open
BingooYang wants to merge 3 commits into
added 3 commits
May 15, 2026 10:51
- Add paddle.enable_compat() and monkey-patches to tests/conftest.py:
  - Stream.cuda_stream property (paddle uses __cuda_stream__() returning tuple)
  - torch.cuda.current_blas_handle (paddle.cuda lacks this API)
- Fix torch.device(device=...) -> torch.device(...) across test files
- Add __is_paddle_compatible_library__ = True to flashinfer/__init__.py
- Add use_paddle_compatible_api() helper to flashinfer/utils.py
- Make flashinfer/triton imports optional (triton may not be available)
- Add _CudaOutOfMemoryError sentinel in flashinfer/autotuner.py
- Fix _get_cuda_stream() in cutlass/torch.py for paddle compat
- Rename package to flashinfer-python-paddle in pyproject.toml

Test results:
- test_group_gemm.py: 288 passed, 360 skipped
- test_mm_bf16.py: 1081 passed (cudnn/auto failures due to libcudart env conflict)
- test_bmm_bf16.py: 32 passed (cudnn/auto failures due to libcudart env conflict)

Known limitations (not adaptation issues):
- cudnn/auto backend: libcudart.so.12 vs .13 conflict (environment issue)
- res_dtype != bfloat16: paddle tensor copy between different dtypes not supported
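For readers unfamiliar with the Stream.cuda_stream patch mentioned in this commit, the idea can be sketched roughly as follows. This is not the code from tests/conftest.py: FakeStream and patch_cuda_stream are hypothetical names used only for illustration, and the sketch assumes the tuple returned by __cuda_stream__() has the layout (version, handle).

```python
class FakeStream:
    """Hypothetical stand-in for a stream type that only exposes __cuda_stream__()."""

    def __init__(self, handle: int):
        self._handle = handle

    def __cuda_stream__(self):
        # Assumed layout: (protocol version, raw stream handle).
        return (0, self._handle)


def _cuda_stream(self) -> int:
    # Unpack the raw handle from the tuple so callers can read it as an attribute.
    return self.__cuda_stream__()[1]


def patch_cuda_stream(stream_cls) -> None:
    """Attach a read-only cuda_stream property if the class does not have one."""
    if not hasattr(stream_cls, "cuda_stream"):
        stream_cls.cuda_stream = property(_cuda_stream)


patch_cuda_stream(FakeStream)
stream = FakeStream(1234)
handle = stream.cuda_stream  # attribute-style access now works
```

Guarding with hasattr keeps the patch idempotent and avoids overriding a native cuda_stream attribute if a later release provides one.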
…m_bf16 under paddle compat

- test_group_gemm.py: sm80 backend 288 PASS, 36 SKIP (batch_size*rows>8192); sm90 SKIP (SM100 device, no sm90 GEMM support); zero code changes needed
- test_mm_bf16.py: adapted via §35 fix (torch.device kwarg -> positional); cutlass/tgv/cublaslt/tinygemm backends pass; cudnn/auto-float32 FAIL due to §47 env issue (Multiple libcudart.so.12 vs .so.13)
- test_bmm_bf16.py: adapted via §35; cutlass backend pass; auto+float32 FAIL due to §47
- Regression: norm PASS (102+35 cases), comm PASS, cherry-picked base fixes from c11b6f55

Refs: adaptation-paddle/adaptation_exp.md §35 §47
- Replace try/import paddle with importlib.util.find_spec() in utils.py
- Apply ruff-format to 5 modified files
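As a rough illustration of the find_spec() approach named in this commit, an optional dependency can be detected without importing it. The helper name is_module_available is hypothetical and this is only a sketch, not the actual code in utils.py.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Return True if a module called `name` can be located, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for dotted names with a missing parent package,
        # or when a module in sys.modules has no usable __spec__.
        return False


# Example: decide at import time whether a Paddle-specific path is even possible.
HAS_PADDLE = is_module_available("paddle")
```

Compared with a try/import block, this avoids the side effects and cost of importing the package just to check that it is installed.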
a7265ce to
3af530f
📌 Description
Adapt three GEMM tests to run under paddle.enable_compat() mode.
Changes
Test Results
Known Non-Paddle Issues
🔍 Related Issues
Part of the Paddle compatibility adaptation series.
🚀 Pull Request Checklist
✅ Pre-commit Checks
🧪 Tests
Reviewer Notes
test_group_gemm.py required no code changes: the sm80 backend passes with the base conftest.py patches alone. The cudnn/auto failures are caused by the §47 environment issue and are unrelated to the Paddle adaptation.