Summary
Quantizing a TAPEX model (microsoft/tapex-*, task table-question-answering, model_type=bart) for NPU/QNN fails during calibration:
RuntimeError: Failed to create feature-extraction dataset: table input must of type
`pd.DataFrame` (single example), `list[pd.DataFrame]` (batch of examples).
Config generation and ONNX export succeed (the model routes through the (bart, table-question-answering) composite path → encoder=feature-extraction, decoder=text2text-generation); the failure is purely in the calibration-dataset layer when quantizing the encoder.
Reproduction
Full path (e2e_eval, NPU/QNN → w8a16):
uv run scripts/e2e_eval/run_eval.py --hf-model microsoft/tapex-base-finetuned-wikisql --device npu --ep qnn
# -> [FAIL (EXPORT_FAIL)] ... build_build_encoder_failed
Minimal (no NPU, tokenizer only):
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("microsoft/tapex-base-finetuned-wikisql") # TapexTokenizer
tok(["Manhattan is in New York"]) # RuntimeError: table input must of type `pd.DataFrame` ...
Root cause
The encoder sub-component is calibrated as task feature-extraction, which maps to TextDataset (datasets/__init__.py::TASK_DATASET_MAPPING). TextDataset tokenizes plain text with AutoTokenizer.from_pretrained(model_name) (datasets/text.py:158). For TAPEX that returns TapexTokenizer, whose first positional argument is a table (pandas DataFrame); it rejects plain-text input. The error is wrapped at datasets/__init__.py:136. There is no table-aware calibration dataset, so any model whose tokenizer requires non-text inputs cannot be calibrated through the generic text path.
When it triggers
Calibration only runs on the quantizing path (--ep qnn / NPU → w8a16, quantizer.py:112). Non-quantized builds skip calibration and don't hit this.
Proposed fixes (pick one)
- Table-aware calibration dataset — feed a real
pd.DataFrame (+ query) to TapexTokenizer for table-QA / TAPEX models.
- Tokenizer-failure fallback — when the task tokenizer rejects plain text, fall back to
RandomDataset (random tensors from ONNX I/O metadata) instead of failing the build.
- Exclude
table-question-answering from quantized e2e_eval — registry/harness-level filter, if table QA isn't a target for NPU quantization.
Summary
Quantizing a TAPEX model (
microsoft/tapex-*, tasktable-question-answering,model_type=bart) for NPU/QNN fails during calibration:Config generation and ONNX export succeed (the model routes through the
(bart, table-question-answering)composite path → encoder=feature-extraction, decoder=text2text-generation); the failure is purely in the calibration-dataset layer when quantizing the encoder.Reproduction
Full path (e2e_eval, NPU/QNN → w8a16):
Minimal (no NPU, tokenizer only):
Root cause
The encoder sub-component is calibrated as task
feature-extraction, which maps toTextDataset(datasets/__init__.py::TASK_DATASET_MAPPING).TextDatasettokenizes plain text withAutoTokenizer.from_pretrained(model_name)(datasets/text.py:158). For TAPEX that returnsTapexTokenizer, whose first positional argument is atable(pandasDataFrame); it rejects plain-text input. The error is wrapped atdatasets/__init__.py:136. There is no table-aware calibration dataset, so any model whose tokenizer requires non-text inputs cannot be calibrated through the generic text path.When it triggers
Calibration only runs on the quantizing path (
--ep qnn/ NPU → w8a16,quantizer.py:112). Non-quantized builds skip calibration and don't hit this.Proposed fixes (pick one)
pd.DataFrame(+ query) toTapexTokenizerfor table-QA / TAPEX models.RandomDataset(random tensors from ONNX I/O metadata) instead of failing the build.table-question-answeringfrom quantized e2e_eval — registry/harness-level filter, if table QA isn't a target for NPU quantization.