Skip to content

Quantization calibration fails for TAPEX (table-question-answering): TextDataset feeds plain text to TapexTokenizer, which requires a DataFrame #842

@timenick

Description

@timenick

Summary

Quantizing a TAPEX model (microsoft/tapex-*, task table-question-answering, model_type=bart) for NPU/QNN fails during calibration:

RuntimeError: Failed to create feature-extraction dataset: table input must of type
`pd.DataFrame` (single example), `list[pd.DataFrame]` (batch of examples).

Config generation and ONNX export succeed (the model routes through the (bart, table-question-answering) composite path → encoder=feature-extraction, decoder=text2text-generation); the failure is purely in the calibration-dataset layer when quantizing the encoder.

Reproduction

Full path (e2e_eval, NPU/QNN → w8a16):

uv run scripts/e2e_eval/run_eval.py --hf-model microsoft/tapex-base-finetuned-wikisql --device npu --ep qnn
# -> [FAIL (EXPORT_FAIL)] ... build_build_encoder_failed

Minimal (no NPU, tokenizer only):

from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("microsoft/tapex-base-finetuned-wikisql")  # TapexTokenizer
tok(["Manhattan is in New York"])  # RuntimeError: table input must of type `pd.DataFrame` ...

Root cause

The encoder sub-component is calibrated as task feature-extraction, which maps to TextDataset (datasets/__init__.py::TASK_DATASET_MAPPING). TextDataset tokenizes plain text with AutoTokenizer.from_pretrained(model_name) (datasets/text.py:158). For TAPEX that returns TapexTokenizer, whose first positional argument is a table (pandas DataFrame); it rejects plain-text input. The error is wrapped at datasets/__init__.py:136. There is no table-aware calibration dataset, so any model whose tokenizer requires non-text inputs cannot be calibrated through the generic text path.

When it triggers

Calibration only runs on the quantizing path (--ep qnn / NPU → w8a16, quantizer.py:112). Non-quantized builds skip calibration and don't hit this.

Proposed fixes (pick one)

  • Table-aware calibration dataset — feed a real pd.DataFrame (+ query) to TapexTokenizer for table-QA / TAPEX models.
  • Tokenizer-failure fallback — when the task tokenizer rejects plain text, fall back to RandomDataset (random tensors from ONNX I/O metadata) instead of failing the build.
  • Exclude table-question-answering from quantized e2e_eval — registry/harness-level filter, if table QA isn't a target for NPU quantization.

Metadata

Metadata

Assignees

Labels

P2Medium — minor bug or non-critical improvementbugSomething isn't workingqualityUse for quality control related issuestriagedIssue has been triaged

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions