feat: add --ep-options to winml perf (and other commands) for runtime EP provider options #865

@DingmaomaoBJTU

Description

Motivation

When benchmarking or running inference on the QNN NPU (and other EPs), the runtime EP provider options can significantly affect latency, independently of the build-time quantization config. For example, on QNN HTP:

| EP option | Affects compile | Affects runtime | Values |
| --- | --- | --- | --- |
| `htp_performance_mode` | ❌ (no-op) | ✅ clock governor | `burst`, `high_performance`, `balanced`, `low_power`, `default` |
| `htp_graph_finalization_optimization_mode` | ✅ changes compiled graph |
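To make the request concrete, here is a minimal sketch of how a `--ep-options` value might be turned into provider options. The `key=value,key=value` syntax and the helper names `parse_ep_options` and `build_providers` are assumptions for illustration only; the issue does not specify the flag's format. The only validation applied is against the `htp_performance_mode` values listed above.

```python
from typing import Dict, List, Tuple

# Values listed for htp_performance_mode in this issue.
HTP_PERFORMANCE_MODES = {"burst", "high_performance", "balanced", "low_power", "default"}


def parse_ep_options(text: str) -> Dict[str, str]:
    """Parse 'key=value,key=value' into a dict (hypothetical flag syntax)."""
    options: Dict[str, str] = {}
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {item!r}")
        options[key] = value

    mode = options.get("htp_performance_mode")
    if mode is not None and mode not in HTP_PERFORMANCE_MODES:
        raise ValueError(f"unknown htp_performance_mode: {mode!r}")
    return options


def build_providers(ep_name: str, text: str) -> List[Tuple[str, Dict[str, str]]]:
    """Return a list of (provider name, options dict) pairs (hypothetical helper)."""
    return [(ep_name, parse_ep_options(text))]


providers = build_providers("QNNExecutionProvider", "htp_performance_mode=burst")
print(providers)  # [('QNNExecutionProvider', {'htp_performance_mode': 'burst'})]
```

A list of `(name, options)` pairs like this is the shape ONNX Runtime's Python `InferenceSession` accepts for its `providers` argument, so the same compiled model could be benchmarked under different runtime settings without rebuilding it.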

Metadata

Labels

P2 (Medium: minor bug or non-critical improvement); feature scale (Feature scale work item); triaged (Issue has been triaged)
