[custom op] Drive unitary-synthesis with the dialect-conversion framework #4751

Open
thedaemon-wizard wants to merge 1 commit into NVIDIA:main from thedaemon-wizard:conversion-rewriter-synthesis

@thedaemon-wizard

Contributor

Follow-up to #4693 (recursive Quantum Shannon Decomposition for 3–5 qubit custom
operations, which closed #2242). As discussed there
(#4693 (comment)), the
conversion-rewriter improvement was deferred to a separate PR — this is that PR.

Summary

The unitary-synthesis pass
(lib/Optimizer/Transforms/UnitarySynthesis.cpp) is reworked from a greedy rewrite
driver (applyPatternsGreedily + OpRewritePattern) to the dialect-conversion
framework (ConversionTarget + applyPartialConversion + OpConversionPattern).

This is a driver-only change: the decomposition math and the emitted gate sequences
are unchanged, so every existing UnitarySynthesis test passes as-is.

Motivation

With a conversion target, the guarantee "do not commit a rewrite unless the legality
constraints are met" is expressed by the framework rather than by hand. Previously the pass
enforced composability inside the pattern: it validated the whole decomposition before
emitting anything and returned failure() when validation failed, leaving the IR untouched.
With dialect conversion, a quake.custom_unitary_constant is marked illegal (must be
rewritten) exactly when a decomposed replacement kernel can be produced for it; everything
else stays legal and is left untouched. The pass remains composable: unsupported dimensions,
non-unitary matrices, and the rare reconstruction miss are left unchanged with a warning /
LLVM_DEBUG note, never a signalPassFailure or a crash.
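
The legality idea can be illustrated without MLIR. The following is a minimal sketch, not the code in this PR: ToyOp, makeLegality, and applyToyPartialConversion are hypothetical stand-ins for an operation, a dynamic legality callback, and a partial conversion. It only shows that ops for which no kernel exists are never touched.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an operation; NOT a real MLIR or CUDA-Q type.
struct ToyOp {
  std::string name;      // e.g. "quake.custom_unitary_constant"
  std::string generator; // symbol of the generator matrix, empty if none
};

using LegalityFn = std::function<bool(const ToyOp &)>;

// Hypothetical legality predicate: a custom op is illegal exactly when a
// replacement kernel was produced for its generator. Everything else is legal.
LegalityFn makeLegality(std::set<std::string> generatorsWithKernel) {
  return [kernels = std::move(generatorsWithKernel)](const ToyOp &op) {
    if (op.name != "quake.custom_unitary_constant")
      return true;
    return kernels.count(op.generator) == 0;
  };
}

// Toy analogue of a partial conversion: only illegal ops are rewritten and
// legal ops are left exactly as they were. Returns the number of rewrites.
int applyToyPartialConversion(std::vector<ToyOp> &ops,
                              const LegalityFn &isLegal) {
  int rewritten = 0;
  for (ToyOp &op : ops) {
    if (isLegal(op))
      continue;
    op.name = "quake.apply"; // pure structural replacement
    ++rewritten;
    assert(isLegal(op) && "a rewritten op must be legal afterwards");
  }
  return rewritten;
}
```

In this model the pattern body never has to validate anything: the decision whether a rewrite may happen lives entirely in the predicate.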

What changed

  • Driver: runOnOperation now builds a ConversionTarget and runs a single
    module-level applyPartialConversion. CustomUnitaryPattern is an
    OpConversionPattern<quake::CustomUnitaryConstantOp> whose body is a pure structural
    replacement with quake.apply.
  • Legality predicate: a custom op is illegal iff a decomposed kernel exists for its
    generator; this is what enforces composability.
  • Kernel pre-generation: the decomposed replacement kernels are generated once per
    generator with a plain OpBuilder, walking the custom ops in reverse so the kernels keep
    their original module order independent of the conversion driver's traversal order. Each
    generator matrix is classified/decomposed at most once via a small cache.
  • API: emitDecomposedFuncOp / the gray-code emitMux now take an OpBuilder & so the
    kernels are built outside the conversion process.
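
The pre-generation step listed above can be sketched as follows. This is a standalone illustration rather than the pass's code: DecomposeFn and pregenerateKernels are hypothetical names, generators are modeled as strings, and the sketch assumes each new kernel is inserted at the front of the module, which is one way a reverse walk can end up preserving the original order.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical decomposer: returns a kernel name for a generator, or nullopt
// when the generator cannot be decomposed (unsupported or non-unitary).
using DecomposeFn =
    std::function<std::optional<std::string>(const std::string &)>;

struct PregenResult {
  std::deque<std::string> kernels; // kernels in their final module order
  int decomposeCalls = 0;          // how often the decomposer actually ran
};

// Walk the custom ops in reverse and insert each new kernel at the front
// (an assumption of this sketch), so the kernels end up in forward order.
PregenResult
pregenerateKernels(const std::vector<std::string> &customOpGenerators,
                   const DecomposeFn &decompose) {
  PregenResult result;
  // Cache: each generator is classified/decomposed at most once.
  std::map<std::string, std::optional<std::string>> cache;
  for (auto it = customOpGenerators.rbegin();
       it != customOpGenerators.rend(); ++it) {
    if (cache.count(*it) != 0)
      continue; // already handled for an op visited earlier in the walk
    std::optional<std::string> kernel = decompose(*it);
    ++result.decomposeCalls;
    cache.emplace(*it, kernel);
    if (kernel)
      result.kernels.push_front(*kernel);
  }
  assert(result.decomposeCalls == static_cast<int>(cache.size()));
  return result;
}
```

Generators for which the decomposer returns nullopt produce no kernel, so under the legality rule their custom ops stay legal and unchanged. How the real pass orders kernels for a generator that is used more than once is not stated in this PR description, and the sketch makes no claim about it.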

The design intentionally does not rely on conversion rollback: kernels are
pre-generated and a custom op is only rewritten once it is known-legalizable, so there is
nothing to undo. This matches the direction of MLIR's One-Shot Dialect Conversion driver,
where ConversionConfig::allowPatternRollback now defaults to false
(llvm/llvm-project#151865).

Verification

Built cudaq-opt against LLVM and ran every test/Transforms/UnitarySynthesis/*.qke
with its RUN lines:

  • All FileCheck + CircuitCheck RUN lines pass (19/19 across the 11 tests) — the 8
    pre-existing ZYZ/KAK tests, the 3-qubit random_unitary-5, and the CircuitCheck-only
    4-qubit/5-qubit random_unitary-6/-7. Because the emitted IR is unchanged, every
    FileCheck still matches and CircuitCheck still confirms unitary equivalence.
  • Composability: feeding a power-of-two dim = 64 (6-qubit) op and a non-unitary
    supported-dimension op through cudaq-opt --unitary-synthesis leaves the
    quake.custom_unitary_constant unchanged with returncode 0 (the dim-64 case prints the
    LLVM_DEBUG cap note; the non-unitary case prints the "matrix must be unitary" warning).
    No signalPassFailure, no crash.
  • clang-format clean.
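
The two composability cases above come down to a dimension check and a unitarity check. Here is a minimal sketch of such checks, assuming a 5-qubit cap (inferred from the 3–5 qubit scope of #4693 and the rejected dim-64 case) and an arbitrary tolerance of 1e-9; isSupportedDimension, isUnitary, and classify are hypothetical names, not functions from UnitarySynthesis.cpp.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<std::complex<double>>>;

// True for power-of-two dimensions up to 2^maxQubits. The default cap of 5
// qubits is an assumption of this sketch.
bool isSupportedDimension(std::size_t dim, std::size_t maxQubits = 5) {
  if (dim < 2 || (dim & (dim - 1)) != 0)
    return false;
  std::size_t qubits = 0;
  while ((std::size_t{1} << qubits) < dim)
    ++qubits;
  return qubits <= maxQubits;
}

// Checks that U^dagger * U equals the identity within tol (assumed value).
bool isUnitary(const Matrix &u, double tol = 1e-9) {
  assert(tol > 0.0);
  const std::size_t n = u.size();
  for (const auto &row : u)
    if (row.size() != n)
      return false; // not square
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      std::complex<double> sum(0.0, 0.0);
      for (std::size_t k = 0; k < n; ++k)
        sum += std::conj(u[k][i]) * u[k][j];
      const std::complex<double> expected(i == j ? 1.0 : 0.0, 0.0);
      if (std::abs(sum - expected) > tol)
        return false;
    }
  }
  return true;
}

enum class Verdict { Rewrite, LeaveUnchanged };

// Anything that fails a check is left unchanged instead of failing the pass.
Verdict classify(const Matrix &u) {
  if (!isSupportedDimension(u.size()) || !isUnitary(u))
    return Verdict::LeaveUnchanged;
  return Verdict::Rewrite;
}
```

A rejected matrix yields LeaveUnchanged rather than an error, mirroring the returncode 0 behaviour reported for the dim-64 and non-unitary inputs.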

AI usage disclosure

This change was prepared with AI assistance (Claude) for code refactoring, test execution,
and drafting; all changes were reviewed and verified locally by the author.

…work

Follow-up to NVIDIA#4693 (recursive QSD for 3-5 qubit custom operations): rework the
unitary-synthesis pass from a greedy rewrite driver (applyPatternsGreedily +
OpRewritePattern) to the dialect-conversion framework (ConversionTarget +
applyPartialConversion + OpConversionPattern), as discussed during that review.

With a conversion target the "do not commit unless legal" guarantee is enforced
by the legality predicate instead of by validating-before-emitting inside the
pattern: a quake.custom_unitary_constant is illegal (must be rewritten) exactly
when a decomposed replacement kernel exists, and every other custom op stays
legal and is left untouched. The pass therefore remains composable -- unsupported
dimensions, non-unitary matrices, and the rare reconstruction miss are left
unchanged with a warning / LLVM_DEBUG note, never a signalPassFailure or crash.

The synthesized circuit is unchanged. The decomposed replacement kernels are
pre-generated once per generator (walking the custom ops in reverse so the
kernels keep their original module order regardless of the conversion driver's
traversal order) and each generator matrix is classified/decomposed at most once
via a small cache. emitDecomposedFuncOp now takes an OpBuilder so the kernels can
be built outside the conversion process.

No behavioral change to the decomposition math or emitted gates; all
UnitarySynthesis FileCheck and CircuitCheck tests pass unchanged.

Signed-off-by: thedaemon-wizard <amon.koike@daemons.jp>
@copy-pr-bot

copy-pr-bot Bot commented Jun 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@khalatepradnya

khalatepradnya commented Jun 22, 2026

Collaborator

/ok to test 9e69867

@github-actions

CI Summary (push) — ✅ passed

Run #27927669329 · ✅ 6 · ⏩ 7 · ❌ 0 · ⛔ 0

Top-level jobs (13)
Job Result
binaries ⏩ skipped
build_and_test ✅ success
config_devdeps ✅ success
config_source_build ⏩ skipped
config_wheeldeps ✅ success
devdeps ✅ success
docker_image ⏩ skipped
gen_code_coverage ⏩ skipped
metadata ✅ success
python_metapackages ⏩ skipped
python_wheels ⏩ skipped
source_build ⏩ skipped
wheeldeps ✅ success
⏩ Skipped jobs (7) — intentionally skipped on PR builds; run on merge_group / workflow_dispatch
Job
binaries
config_source_build
docker_image
gen_code_coverage
python_metapackages
python_wheels
source_build
All sub-jobs (42) — every matrix leg, with links
Job Status Link
Build and test (amd64, gcc12, openmpi) / Dev environment (Debug) ✅ success view
Build and test (amd64, gcc12, openmpi) / Dev environment (Python) ✅ success view
Build and test (amd64, llvm, openmpi) / Dev environment (Debug) ✅ success view
Build and test (amd64, llvm, openmpi) / Dev environment (Python) ✅ success view
Build and test (arm64, llvm, openmpi) / Dev environment (Debug) ✅ success view
Build and test (arm64, llvm, openmpi) / Dev environment (Python) ✅ success view
CI Summary ❔ in_progress view
Configure build (devdeps) ✅ success view
Configure build (source_build) ⏩ skipped view
Configure build (wheeldeps) ✅ success view
Create CUDA Quantum installer ⏩ skipped view
Create Docker images ⏩ skipped view
Create Python metapackages ⏩ skipped view
Create Python wheels ⏩ skipped view
Gen code coverage ⏩ skipped view
Load dependencies (amd64, gcc12) / Caching ✅ success view
Load dependencies (amd64, gcc12) / Finalize ✅ success view
Load dependencies (amd64, gcc12) / Metadata ✅ success view
Load dependencies (amd64, llvm) / Caching ✅ success view
Load dependencies (amd64, llvm) / Finalize ✅ success view
Load dependencies (amd64, llvm) / Metadata ✅ success view
Load dependencies (arm64, gcc12) / Caching ✅ success view
Load dependencies (arm64, gcc12) / Finalize ✅ success view
Load dependencies (arm64, gcc12) / Metadata ✅ success view
Load dependencies (arm64, llvm) / Caching ✅ success view
Load dependencies (arm64, llvm) / Finalize ✅ success view
Load dependencies (arm64, llvm) / Metadata ✅ success view
Load source build cache ⏩ skipped view
Load wheel dependencies (amd64, 12.6) / Caching ✅ success view
Load wheel dependencies (amd64, 12.6) / Finalize ✅ success view
Load wheel dependencies (amd64, 12.6) / Metadata ✅ success view
Load wheel dependencies (amd64, 13.0) / Caching ✅ success view
Load wheel dependencies (amd64, 13.0) / Finalize ✅ success view
Load wheel dependencies (amd64, 13.0) / Metadata ✅ success view
Load wheel dependencies (arm64, 12.6) / Caching ✅ success view
Load wheel dependencies (arm64, 12.6) / Finalize ✅ success view
Load wheel dependencies (arm64, 12.6) / Metadata ✅ success view
Load wheel dependencies (arm64, 13.0) / Caching ✅ success view
Load wheel dependencies (arm64, 13.0) / Finalize ✅ success view
Load wheel dependencies (arm64, 13.0) / Metadata ✅ success view
Prepare cache clean-up ❔ in_progress view
Retrieve PR info ✅ success view
✅ Required checks (6/6) — declared in .github/required-checks.yml for push
Required check Status Link
Build and test (amd64, llvm, openmpi) / Dev environment (Debug) ✅ success view
Build and test (amd64, llvm, openmpi) / Dev environment (Python) ✅ success view
Build and test (arm64, llvm, openmpi) / Dev environment (Debug) ✅ success view
Build and test (arm64, llvm, openmpi) / Dev environment (Python) ✅ success view
Build and test (amd64, gcc12, openmpi) / Dev environment (Debug) ✅ success view
Build and test (amd64, gcc12, openmpi) / Dev environment (Python) ✅ success view

Development

Successfully merging this pull request may close these issues.

[custom op] Support unitary synthesis for 3+ qubit operations

2 participants