Task detection diverges between inspect and the model loader for configs without architectures (#864)

## Summary

The two offline task-detection paths disagree for a config that has **no resolvable `architectures`**: `inspect` returns a fallback task while the model-loading commands raise `ValueError`.

- **`detect_task(config)`** — used by `inspect`, `eval`, and `config`'s seq2seq composite routing
- **`resolve_task_and_model_class(config, task=None)`** — used by the model loader, i.e. `build` / `export` / `perf` / `run` / `serve` / `quantize`, and `config`'s single-config path (via `generate_hf_build_config`)

For any **real checkpoint** the two agree (they share the D2 modality upgrade and converge on `TasksManager` architecture-head detection). They diverge only when the config carries no resolvable architecture — e.g. an architecture-less `--model-type X` invocation (no `-m` checkpoint), or a default/synthetic config.

## Reproduction

```python
from transformers import SamConfig, CLIPConfig
from winml.modelkit.loader import detect_task, resolve_task_and_model_class

detect_task(SamConfig())                    # -> ('mask-generation', 'HF_MODEL_CLASS_MAPPING')
resolve_task_and_model_class(SamConfig())   # -> ValueError

detect_task(CLIPConfig())                   # -> ('next-sentence-prediction', 'HF_TASK_DEFAULTS')
resolve_task_and_model_class(CLIPConfig())  # -> ValueError
```

At the command surface:

```
winml inspect --model-type sam     # task = mask-generation
winml build   --model-type sam     # (or perf / run / quantize) errors: task cannot be resolved
```

So the divergence is not "two different valid tasks": **`inspect` is lenient (returns a fallback), while the loader-backed commands are strict (raise)**.

## Root cause

`detect_task` (`loader/task.py`) has two fallbacks that `resolve_task_and_model_class` does not:

1. the `HF_MODEL_CLASS_MAPPING` step-1 short-circuit, which fires on `model_type` alone with no architecture needed (e.g. `sam` -> `mask-generation`);
2. the `HF_TASK_DEFAULTS` fallback, which applies when nothing else matches (e.g. a default `CLIPConfig` -> `next-sentence-prediction`).

`resolve_task_and_model_class` Case 1 goes straight to `_detect_task_and_class_from_config`, which must resolve the architecture/model class and raises when it cannot.
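
The asymmetry can be modelled in a few lines. The sketch below is a toy reconstruction, not the code in `loader/task.py`: the table contents, the `"architectures"` source label, and the names `lenient_detect`, `strict_resolve`, and `task_from_architectures` are all assumptions made for illustration.

```python
# Toy model of the two detection paths. This is NOT the real implementation;
# every table and helper name here is a hypothetical stand-in.
from types import SimpleNamespace

MODEL_TYPE_TASKS = {"sam": "mask-generation"}           # stands in for HF_MODEL_CLASS_MAPPING
DEFAULT_TASK = "next-sentence-prediction"               # stands in for HF_TASK_DEFAULTS
ARCHITECTURE_TASKS = {"BertForMaskedLM": "fill-mask"}   # stands in for architecture-head detection


def task_from_architectures(config):
    """Return the task of the first recognised architecture, or None."""
    for arch in getattr(config, "architectures", None) or []:
        if arch in ARCHITECTURE_TASKS:
            return ARCHITECTURE_TASKS[arch]
    return None


def lenient_detect(config):
    """Shape of the lenient path: model_type short-circuit, architecture, default."""
    if config.model_type in MODEL_TYPE_TASKS:
        return MODEL_TYPE_TASKS[config.model_type], "HF_MODEL_CLASS_MAPPING"
    task = task_from_architectures(config)
    if task is not None:
        return task, "architectures"  # source label is an assumption
    return DEFAULT_TASK, "HF_TASK_DEFAULTS"


def strict_resolve(config):
    """Shape of the strict path: architecture only, raise when unresolved."""
    task = task_from_architectures(config)
    if task is None:
        raise ValueError(f"cannot resolve task for model_type={config.model_type!r}")
    return task


# Architecture-less configs, as produced by a bare --model-type invocation.
sam = SimpleNamespace(model_type="sam", architectures=None)
clip = SimpleNamespace(model_type="clip", architectures=None)
# A config that carries an architecture, as a real checkpoint would.
bert = SimpleNamespace(model_type="bert", architectures=["BertForMaskedLM"])
```

In this model `lenient_detect` always returns something, while `strict_resolve` raises for `sam` and `clip` and agrees with the lenient path for `bert`, which mirrors the behavior reported above.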

## Scope / severity

**Low.** Only affects configs without resolvable `architectures` (architecture-less `--model-type` invocations, synthetic/default configs). Every real checkpoint carries `architectures`, so the two paths agree there — verified across bert (MaskedLM/SeqClass), bart (CondGen/SeqClass), t5 (CondGen), resnet, dinov2 (D2 modality), sam, and clip with architectures present.

## Options

1. **Document as intended** — `inspect` is a lenient reporter; loader-backed commands are strict and should reject what they cannot load. (current behavior)
2. **Parity** — either give `resolve_task_and_model_class` the same fallbacks (lenient: both succeed), or drop the fallbacks from `detect_task` (strict: both raise) so `inspect` and the model-loading commands always agree.

Either way, a consistency test comparing `detect_task` vs `resolve_task_and_model_class` (not just `detect_task` vs `inspect_detect_task`, which is all `tests/integration/test_task_consistency.py` covers today) would lock the invariant.
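
One possible shape for such a check is sketched below. It is deliberately independent of `winml`: the two paths are passed in as callables that return a task string, so the helper names `run_path` and `paths_agree` are hypothetical, and the wiring to the real functions in the trailing comment assumes that both return the task as their first element.

```python
# Minimal parity-check sketch. `run_path` and `paths_agree` are hypothetical
# helpers, not part of the existing test suite.
from typing import Any, Callable, Optional, Tuple

Outcome = Tuple[Optional[str], Optional[str]]  # (task, error type name)


def run_path(path: Callable[[Any], str], config: Any) -> Outcome:
    """Run one detection path, turning a ValueError into a comparable outcome."""
    try:
        return path(config), None
    except ValueError as exc:
        return None, type(exc).__name__


def paths_agree(config: Any,
                lenient: Callable[[Any], str],
                strict: Callable[[Any], str]) -> bool:
    """True when both paths return the same task or both raise ValueError."""
    return run_path(lenient, config) == run_path(strict, config)


# Assumed wiring against the real functions (return shapes not confirmed):
#   paths_agree(cfg,
#               lambda c: detect_task(c)[0],
#               lambda c: resolve_task_and_model_class(c)[0])
```

Treating "both raise" as agreement keeps the helper usable under either parity option (lenient or strict); under option 1 the test would instead assert the documented divergence for architecture-less configs.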

