P1-FEATURE-007: Graph Optimizer — QLinear* Rewrite #154

## Summary

Implement pattern-based rewriting for QLinear operators (QLinearConv, QLinearMatMul) to enable quantization-aware graph transformations needed for INT8/INT4 model deployment on NPUs.

## Context

QLinear operators (QLinearConv, QLinearMatMul) are ONNX operators that compute directly on quantized tensors and take the scales and zero-points as explicit inputs. Fusing and rewriting these operators (e.g., QLinearConv → fused INT8 kernel) is required for efficient execution on hardware NPUs that support quantized computation natively. This is the last graph optimizer capability needed to complete the QDQ pipeline.
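
For reference, the affine quantization mapping that these operators rely on can be sketched in plain Python. This is an illustrative scalar version only: the function names and the int8 range are assumptions for the example, not part of the optimizer.

```python
def quantize_linear(x: float, scale: float, zero_point: int,
                    qmin: int = -128, qmax: int = 127) -> int:
    """Map a real value to an integer code: saturate(round(x / scale) + zero_point)."""
    # Python's round() rounds half to even, matching ONNX QuantizeLinear.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize_linear(q: int, scale: float, zero_point: int) -> float:
    """Map an integer code back to a real value: (q - zero_point) * scale."""
    return (q - zero_point) * scale


scale, zero_point = 0.5, 3
q = quantize_linear(2.0, scale, zero_point)   # round(4.0) + 3
x = dequantize_linear(q, scale, zero_point)   # (q - 3) * 0.5
```

A QLinear operator takes already-quantized inputs plus their `scale` and `zero_point` tensors, so a rewrite rule has to preserve this mapping for every input and output it touches.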

From:
- `plans/release/0315_release_plan/P1_CHECKLIST.md` (P1-FEATURE-006)
- `plans/release/0501_release_plan/P0_CHECKLIST.md` (P1-FEATURE-007)

## Current State

- Graph optimizer supports FP32/FP16 rewrites (#397) and attribute changes (#396)
- No QLinear-specific rewrite rules implemented
- QDQ pipeline (#401) produces QLinear operators that need post-processing

## Desired State

- QLinear* patterns recognized and rewritten by the graph optimizer
- QLinearConv + Bias fusion
- QLinearMatMul → INT8 GEMM rewrite (where the execution provider (EP) supports it)
- Quantization-scale/zero-point folding
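
One plausible reading of the QLinearConv + Bias fusion is folding a floating-point bias into QLinearConv's optional `B` input. The ONNX spec requires `B` to be int32, quantized with scale `x_scale * w_scale` and zero-point 0. A minimal sketch of that conversion follows; the helper name `quantize_bias` is hypothetical, and per-output-channel weight scales are assumed.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def quantize_bias(bias, x_scale, w_scales):
    """Quantize a per-output-channel float bias to int32 for QLinearConv's B input.

    Uses scale = x_scale * w_scale and zero_point = 0, as the ONNX spec
    requires for B. `w_scales` holds one weight scale per output channel.
    """
    if len(bias) != len(w_scales):
        raise ValueError("bias and w_scales must have one entry per output channel")
    quantized = []
    for b, w_scale in zip(bias, w_scales):
        q = round(b / (x_scale * w_scale))          # round half to even
        quantized.append(max(INT32_MIN, min(INT32_MAX, q)))  # saturate to int32
    return quantized


b_int32 = quantize_bias([1.0, -0.75], x_scale=0.5, w_scales=[0.25, 0.5])
```

With per-tensor weight quantization, `w_scales` would simply repeat the single scale once per channel.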

## Acceptance Criteria

- [ ] QLinearConv fusion rule implemented and tested
- [ ] QLinearMatMul rewrite rule implemented and tested
- [ ] Scale/zero-point folding across consecutive QDQ patterns implemented and tested
- [ ] All QLinear rewrite rules tested against QDQ-quantized P0 models
- [ ] Runtime-validated: quantized model output matches pre-rewrite output within tolerance
- [ ] All existing tests pass (CARDINAL RULE: no regressions)
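
The runtime-validation criterion could be checked with a comparison along these lines. This is a sketch only: the issue does not state the tolerance, so the `atol` and `rtol` defaults are placeholders, and the outputs are assumed to be flattened to plain lists of floats.

```python
import math


def outputs_match(reference, candidate, atol=1e-3, rtol=1e-3):
    """Return True if every element of `candidate` is close to `reference`.

    The default tolerances are placeholder assumptions, not project values.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rtol, abs_tol=atol)
        for r, c in zip(reference, candidate)
    )


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference, useful in failure messages."""
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


before = [0.10, -1.25, 3.00]     # flattened output of the pre-rewrite model
after = [0.1004, -1.25, 2.9990]  # flattened output of the rewritten model
ok = outputs_match(before, after)
```

For quantized outputs, a tolerance expressed in multiples of the output scale (for example, one quantization step) may be a more natural choice than a fixed absolute value.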

## Technical Notes

- Must follow CARDINAL RULE #1: no hardcoded model architecture assumptions — all pattern matching must be graph-structure-based
- QLinearConv op spec: https://onnx.ai/onnx/operators/onnx__QLinearConv.html
- Test with QDQ output from #401 on at least 3 architectures (CNN, BERT, ViT)
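
To illustrate what graph-structure-based matching means in practice, the sketch below finds `DequantizeLinear` → `QuantizeLinear` pairs purely from op types and tensor edges, never from node names or model-specific layouts. The `Node` class and `find_dq_q_pairs` are hypothetical stand-ins for the optimizer's own graph types. Comparing scale and zero-point by tensor name is a simplification; a real rule would more likely compare the initializer values and also check that the intermediate tensor is not a graph output.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op_type: str
    inputs: list   # tensor names: [data, scale, zero_point]
    outputs: list  # tensor names


def find_dq_q_pairs(nodes):
    """Find DequantizeLinear -> QuantizeLinear pairs that cancel out.

    A pair is reported only if the DQ output feeds nothing but the Q node
    and both nodes reference the same scale and zero-point tensors.
    """
    producer = {name: n for n in nodes for name in n.outputs}
    consumers = {}
    for n in nodes:
        for name in n.inputs:
            consumers.setdefault(name, []).append(n)

    pairs = []
    for q in nodes:
        if q.op_type != "QuantizeLinear":
            continue
        dq = producer.get(q.inputs[0])
        if dq is None or dq.op_type != "DequantizeLinear":
            continue
        if len(consumers[q.inputs[0]]) != 1:
            continue  # the float tensor is used elsewhere; keep the DQ
        if dq.inputs[1:] == q.inputs[1:]:
            pairs.append((dq, q))
    return pairs


nodes = [
    Node("DequantizeLinear", ["a_q", "s", "zp"], ["a_f"]),
    Node("QuantizeLinear", ["a_f", "s", "zp"], ["b_q"]),     # same params: foldable
    Node("DequantizeLinear", ["b_q", "s2", "zp2"], ["c_f"]),
    Node("QuantizeLinear", ["c_f", "s3", "zp3"], ["d_q"]),   # different params: kept
]
pairs = find_dq_q_pairs(nodes)
```

Each reported pair is an identity on the quantized values, so a rewrite could reconnect consumers of the Q output (`b_q` here) to the DQ input (`a_q`) and drop both nodes. Pairs with different parameters are a requantization and need a separate rule.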

## Related Files

- `plans/release/0315_release_plan/feature-scale.md` — P1.4 QLinear Rewrite
- `plans/release/0501_release_plan/feature-scale.md` — P1.7 QLinear Rewrite
- `plans/release/0315_release_plan/P1_CHECKLIST.md` — P1-FEATURE-006
- `plans/release/0501_release_plan/P0_CHECKLIST.md` — P1-FEATURE-007
- #401 — QDQ quantization (upstream dependency)
