P1-FEATURE-007: Graph Optimizer — QLinear* Rewrite #154

@DingmaomaoBJTU

Description

Summary

Implement pattern-based rewriting for QLinear operators (QLinearConv, QLinearMatMul) to enable quantization-aware graph transformations needed for INT8/INT4 model deployment on NPUs.

Context

QLinear operators are ONNX's representation of quantized computations. Fusing and rewriting these operators (e.g., QLinearConv → fused INT8 kernel) is required for efficient execution on hardware NPUs that support quantized computation natively. This is the final graph optimizer capability needed for QDQ pipeline completion.
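For reference when checking a rewrite, the arithmetic a QLinearMatMul node stands for can be sketched in plain Python. This is a minimal sketch, assuming per-tensor scales and zero points and uint8 tensors stored as nested lists. The helper names `saturate_uint8` and `qlinear_matmul` are hypothetical and not part of this repository.

```python
def saturate_uint8(value):
    """Clamp an integer to the uint8 range [0, 255]."""
    return max(0, min(255, value))


def qlinear_matmul(a, a_scale, a_zp, b, b_scale, b_zp, y_scale, y_zp):
    """Reference QLinearMatMul for per-tensor uint8 inputs (illustrative only).

    a is an M x K nested list, b is a K x N nested list.
    """
    # Single requantization multiplier applied to the integer accumulator.
    multiplier = a_scale * b_scale / y_scale
    inner = len(b)
    cols = len(b[0])
    result = []
    for a_row in a:
        out_row = []
        for j in range(cols):
            # Integer accumulation after removing both zero points.
            acc = sum((a_row[k] - a_zp) * (b[k][j] - b_zp) for k in range(inner))
            # Python's round() rounds half to even.
            out_row.append(saturate_uint8(round(acc * multiplier) + y_zp))
        result.append(out_row)
    return result


example = qlinear_matmul(
    a=[[130, 126]], a_scale=0.5, a_zp=128,
    b=[[3], [1]], b_scale=0.25, b_zp=0,
    y_scale=0.125, y_zp=10,
)
print(example)  # [[14]]
```

A fused INT8 kernel is expected to reproduce this integer accumulation followed by one requantization step, which is why a rewrite can be checked against it.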

From:

  • plans/release/0315_release_plan/P1_CHECKLIST.md (P1-FEATURE-006)
  • plans/release/0501_release_plan/P0_CHECKLIST.md (P1-FEATURE-007)

Current State

Desired State

  • QLinear* patterns recognized and rewritten by the graph optimizer
  • QLinearConv + Bias fusion
  • QLinearMatMul → INT8 GEMM rewrite (where EP supports it)
  • Quantization-scale/zero-point folding
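Two of the items above can be illustrated on a toy graph. The following is a rough sketch, not the optimizer's actual data model: `Node`, `fold_dq_q_pairs`, and `quantize_bias` are hypothetical names, scale and zero point are stored as plain fields (real ONNX nodes take them as inputs), and graph outputs and dtype checks are ignored. The bias helper assumes the ONNX QLinearConv convention that the bias is int32 with scale `x_scale * w_scale` and zero point 0.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Toy graph node (hypothetical); not the project's real IR."""
    op_type: str
    inputs: List[str]
    outputs: List[str]
    scale: Optional[float] = None
    zero_point: Optional[int] = None


def fold_dq_q_pairs(nodes):
    """Drop DequantizeLinear -> QuantizeLinear pairs with identical parameters."""
    consumers = {}
    for node in nodes:
        for name in node.inputs:
            consumers.setdefault(name, []).append(node)

    removed = set()
    rename = {}
    for dq in nodes:
        if dq.op_type != "DequantizeLinear":
            continue
        users = consumers.get(dq.outputs[0], [])
        # Only fold when the float tensor has exactly one consumer.
        if len(users) != 1 or users[0].op_type != "QuantizeLinear":
            continue
        q = users[0]
        if (dq.scale, dq.zero_point) != (q.scale, q.zero_point):
            continue
        removed.update((id(dq), id(q)))
        rename[q.outputs[0]] = dq.inputs[0]

    def resolve(name):
        # Follow renames so chains of folded pairs collapse fully.
        while name in rename:
            name = rename[name]
        return name

    return [
        Node(n.op_type, [resolve(i) for i in n.inputs], list(n.outputs),
             n.scale, n.zero_point)
        for n in nodes
        if id(n) not in removed
    ]


def quantize_bias(bias, x_scale, w_scale):
    """Quantize a float bias to int32 with scale x_scale * w_scale, zero point 0."""
    bias_scale = x_scale * w_scale
    lo, hi = -(2 ** 31), 2 ** 31 - 1
    return [max(lo, min(hi, round(v / bias_scale))) for v in bias]


graph = [
    Node("QLinearConv", ["x"], ["c"]),
    Node("DequantizeLinear", ["c"], ["f"], 0.1, 3),
    Node("QuantizeLinear", ["f"], ["q"], 0.1, 3),
    Node("QLinearMatMul", ["q"], ["y"]),
]
folded = fold_dq_q_pairs(graph)
print([n.op_type for n in folded])  # ['QLinearConv', 'QLinearMatMul']
```

The single-consumer check is deliberate: if the dequantized tensor feeds another node as well, removing the pair would change that node's input.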

Acceptance Criteria

  • QLinearConv fusion rule implemented and tested
  • QLinearMatMul rewrite rule implemented and tested
  • Scale/zero-point folding across consecutive QDQ patterns implemented and tested
  • All QLinear rewrite rules tested against QDQ-quantized P0 models
  • Runtime-validated: quantized model output matches pre-rewrite output within tolerance
  • All existing tests pass (CARDINAL RULE: no regressions)
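The runtime-validation criterion could be checked with a comparison along these lines. This is a minimal sketch, assuming the pre-rewrite and post-rewrite outputs have already been collected as flat lists of numbers by whatever runtime the project uses. The function names and the tolerance value are illustrative, since the issue does not specify a tolerance.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two flat outputs."""
    if len(reference) != len(candidate):
        raise ValueError("output lengths differ")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0)


def outputs_match(reference, candidate, atol):
    """True when every element differs by at most atol."""
    return max_abs_diff(reference, candidate) <= atol


# Hypothetical quantized outputs before and after a rewrite.
before = [10, 20, 30]
after = [10, 21, 30]
print(outputs_match(before, after, atol=1))  # True
```

For integer outputs, expressing `atol` in quantization steps (for example 1) makes the check independent of the output scale.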

Technical Notes

Related Files

  • plans/release/0315_release_plan/feature-scale.md — P1.4 QLinear Rewrite
  • plans/release/0501_release_plan/feature-scale.md — P1.7 QLinear Rewrite
  • plans/release/0315_release_plan/P1_CHECKLIST.md — P1-FEATURE-006
  • plans/release/0501_release_plan/P0_CHECKLIST.md — P1-FEATURE-007
  • Lift derived head_dim into a shared NormalizedConfig base class #401 — QDQ quantization (upstream dependency)

Metadata

Assignees

Labels

  • P1: High — major feature broken or significant UX impact
  • QDQ: QDQ quantization
  • feature scale: Feature scale work item
  • graph-optimizer: Graph optimizer module
  • triaged: Issue has been triaged
