feat(perf): add --ep-options to pass runtime EP provider options (#865) #889
Open
xieofxie wants to merge 1 commit into
Adds a repeatable `--ep-options KEY=VALUE` flag to `winml perf` that forwards runtime execution-provider options (e.g. QNN `htp_performance_mode`) to the inference session via `add_provider_for_devices`, for both HuggingFace model IDs and pre-exported ONNX file inputs. The options are threaded through WinMLAutoModel.from_pretrained/from_onnx -> WinMLPreTrainedModel -> WinMLSession. WinMLSession gains a `provider_options` kwarg that merges on top of (and overrides) any build-time `ep_config.provider_options` without affecting EPContext persistence, so these tune the runtime session rather than the compiled graph. Also wired into per-module (`--module`) benchmarking.
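The override behaviour described above can be illustrated with a small sketch. This is not the code from the PR: `merge_provider_options` is a hypothetical helper, and the option values are illustrative only. It assumes that runtime options are layered over a copy of the build-time options, so the build-time dict is left untouched.

```python
from typing import Mapping, Optional


def merge_provider_options(
    build_time: Optional[Mapping[str, str]],
    runtime: Optional[Mapping[str, str]],
) -> dict:
    """Hypothetical helper: layer runtime options over build-time ones."""
    # Copy first so the build-time mapping is never mutated.
    merged = dict(build_time or {})
    # Runtime keys win on conflict; keys only present at build time survive.
    merged.update(runtime or {})
    return merged


# Illustrative values only (assumed, not taken from the PR).
build_opts = {"htp_performance_mode": "balanced", "backend_path": "QnnHtp.dll"}
runtime_opts = {"htp_performance_mode": "burst"}

effective = merge_provider_options(build_opts, runtime_opts)
print(effective)
```

With these inputs the runtime value replaces `htp_performance_mode`, while `backend_path` is carried over unchanged from the build-time options.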
Summary
Fixes #865.
Adds a repeatable `--ep-options KEY=VALUE` flag to `winml perf` that forwards runtime execution-provider options to the inference session. These options (e.g. QNN HTP `htp_performance_mode`) significantly affect runtime latency independently of the build-time quantization config, so being able to set them at benchmark time is essential for tuning.

The flag works for both input modes:
- HuggingFace model ID (`winml perf -m microsoft/resnet-50 --ep-options htp_performance_mode=burst`)
- Pre-exported ONNX file (`winml perf -m model.onnx --ep-options htp_performance_mode=burst`)

It is also wired into per-module (`--module`) benchmarking.

How it works
The options are threaded down to `session.py`'s `add_ep_for_device`, which already accepts an `ep_options` dict.

`WinMLSession` gains a `provider_options` kwarg that merges on top of and overrides any build-time `ep_config.provider_options` but, unlike passing a full `ep_config`, does not flip EPContext persistence (`persist_jit`). These options tune the runtime session, not the compiled graph.

Changes
- `utils/cli.py`: new `ep_options_option()` decorator and `parse_ep_options()` helper (reusable by other commands).
- `session/session.py`: `WinMLSession` accepts `provider_options`, merged over `ep_config` options.
- `models/winml/base.py`, `models/auto.py`: thread `provider_options` through `from_pretrained`/`from_onnx` (including the composite-model and skip-build paths).
- `commands/perf.py`: add `--ep-options`, parse once, pass to both single-model and `--module` paths.
- `docs/commands/perf.md`: document the new flag with an example.

Tests
- `tests/unit/utils/test_cli.py`: `parse_ep_options` (empty/single/multiple/`=`-in-value/duplicate-key/invalid/empty-key).
- `tests/unit/commands/test_perf_cli.py`: `--ep-options` forwarded as `provider_options` for both ONNX and HF paths; CLI parsing into `BenchmarkConfig`; invalid-format rejection; help text.
- `tests/unit/session/test_winml_session.py`: runtime `provider_options` forwarded to `add_provider_for_devices`; runtime options override `ep_config` options.

All affected suites pass (225 passed; 2 pre-existing OpenVINO-EP-dependent failures unrelated to this change, deselected on this machine where OpenVINO is not installed).
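For readers who want a feel for the `KEY=VALUE` cases listed in the tests, here is a minimal sketch of such a parser. It is not the `parse_ep_options()` implementation from this PR: the name `parse_ep_options_sketch` is hypothetical, and two behaviours are assumptions, namely that a repeated key keeps its last value and that malformed input raises `ValueError`.

```python
from typing import Dict, Iterable


def parse_ep_options_sketch(pairs: Iterable[str]) -> Dict[str, str]:
    """Hypothetical sketch: turn repeated KEY=VALUE strings into a dict."""
    options: Dict[str, str] = {}
    for pair in pairs:
        # Split on the first "=" only, so a value may itself contain "=".
        key, sep, value = pair.partition("=")
        key = key.strip()
        # Reject input with no "=" at all, or with an empty key.
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        # Assumption: when a key is repeated, the last occurrence wins.
        options[key] = value
    return options


parsed = parse_ep_options_sketch(
    ["htp_performance_mode=burst", "extra=a=b"]  # second key is illustrative
)
print(parsed)
```

Splitting on the first `=` only is what lets values such as `a=b` pass through intact; the real helper may make different choices for duplicates and error reporting.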
🤖 Generated with Claude Code