Quantization calibration fails for TAPEX (table-question-answering): TextDataset feeds plain text to TapexTokenizer, which requires a DataFrame

## Summary

Quantizing a TAPEX model (`microsoft/tapex-*`, task `table-question-answering`, `model_type=bart`) for NPU/QNN fails during calibration:

```
RuntimeError: Failed to create feature-extraction dataset: table input must of type
`pd.DataFrame` (single example), `list[pd.DataFrame]` (batch of examples).
```

Config generation and ONNX export succeed (the model routes through the `(bart, table-question-answering)` composite path → encoder=`feature-extraction`, decoder=`text2text-generation`); the failure is purely in the calibration-dataset layer when quantizing the encoder.

## Reproduction

Full path (e2e_eval, NPU/QNN → w8a16):

```
uv run scripts/e2e_eval/run_eval.py --hf-model microsoft/tapex-base-finetuned-wikisql --device npu --ep qnn
# -> [FAIL (EXPORT_FAIL)] ... build_build_encoder_failed
```

Minimal (no NPU, tokenizer only):

```python
from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("microsoft/tapex-base-finetuned-wikisql")  # TapexTokenizer
tok(["Manhattan is in New York"])  # RuntimeError: table input must of type `pd.DataFrame` ...
```

## Root cause

The encoder sub-component is calibrated as task `feature-extraction`, which maps to `TextDataset` (`datasets/__init__.py::TASK_DATASET_MAPPING`). `TextDataset` tokenizes plain text with `AutoTokenizer.from_pretrained(model_name)` (`datasets/text.py:158`). For TAPEX that returns `TapexTokenizer`, whose first positional argument is a `table` (pandas `DataFrame`); it rejects plain-text input. The error is wrapped at `datasets/__init__.py:136`. There is no table-aware calibration dataset, so any model whose tokenizer requires non-text inputs cannot be calibrated through the generic text path.

## When it triggers

Calibration only runs on the quantizing path (`--ep qnn` / NPU → w8a16, `quantizer.py:112`). Non-quantized builds skip calibration and don't hit this.

## Proposed fixes (pick one)

- **Table-aware calibration dataset** — feed a real `pd.DataFrame` (+ query) to `TapexTokenizer` for table-QA / TAPEX models.
- **Tokenizer-failure fallback** — when the task tokenizer rejects plain text, fall back to `RandomDataset` (random tensors from ONNX I/O metadata) instead of failing the build.
- **Exclude `table-question-answering` from quantized e2e_eval** — registry/harness-level filter, if table QA isn't a target for NPU quantization.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quantization calibration fails for TAPEX (table-question-answering): TextDataset feeds plain text to TapexTokenizer, which requires a DataFrame #842

Summary

Reproduction

Root cause

When it triggers

Proposed fixes (pick one)

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Quantization calibration fails for TAPEX (table-question-answering): TextDataset feeds plain text to TapexTokenizer, which requires a DataFrame #842

Description

Summary

Reproduction

Root cause

When it triggers

Proposed fixes (pick one)

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions