Summary
The two offline task-detection paths disagree for a config that has no resolvable architectures: inspect returns a fallback task while the model-loading commands raise ValueError.
- detect_task(config): used by inspect, eval, and config's seq2seq composite routing
- resolve_task_and_model_class(config, task=None): used by the model loader (build / export / perf / run / serve / quantize) and by config's single-config path (via generate_hf_build_config)
For any real checkpoint the two agree (they share the D2 modality upgrade and converge on TasksManager architecture-head detection). They diverge only when the config carries no resolvable architecture — e.g. an architecture-less --model-type X invocation (no -m checkpoint), or a default/synthetic config.
Reproduction
from transformers import SamConfig, CLIPConfig
from winml.modelkit.loader import detect_task, resolve_task_and_model_class
detect_task(SamConfig()) # -> ('mask-generation', 'HF_MODEL_CLASS_MAPPING')
resolve_task_and_model_class(SamConfig()) # -> ValueError
detect_task(CLIPConfig()) # -> ('next-sentence-prediction', 'HF_TASK_DEFAULTS')
resolve_task_and_model_class(CLIPConfig()) # -> ValueError
At the command surface:
winml inspect --model-type sam # task = mask-generation
winml build --model-type sam # (or perf / run / quantize) errors: task cannot be resolved
So this is not a case of "two different valid tasks": inspect is lenient (returns a fallback), while the loader-backed commands are strict (raise).
Root cause
detect_task (loader/task.py) has two fallbacks that resolve_task_and_model_class does not:
- the HF_MODEL_CLASS_MAPPING step-1 short-circuit, which fires on model_type alone with no architecture needed (e.g. sam -> mask-generation);
- the HF_TASK_DEFAULTS fallback (-> next-sentence-prediction).
resolve_task_and_model_class Case 1 goes straight to _detect_task_and_class_from_config, which must resolve the architecture/model class and raises when it cannot.
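The asymmetry can be illustrated with a toy model of the two paths. This is only a sketch, not the winml code: the tables, the function names (lenient_detect, strict_resolve), the "architectures" source label, and the ordering of the middle step are all assumptions made for illustration, and strict_resolve returns only a task where the real function also resolves a model class.

```python
from types import SimpleNamespace

# Hypothetical miniature tables; the real mappings live in winml / Optimum.
MODEL_TYPE_TO_TASK = {"sam": "mask-generation"}
ARCHITECTURE_TO_TASK = {"BertForMaskedLM": "fill-mask"}
DEFAULT_TASK = "next-sentence-prediction"


def task_from_architectures(config):
    """Return the task of the first known architecture, or None."""
    for arch in getattr(config, "architectures", None) or []:
        task = ARCHITECTURE_TO_TASK.get(arch)
        if task is not None:
            return task
    return None


def lenient_detect(config):
    """Stand-in for detect_task: always returns (task, source)."""
    model_type = getattr(config, "model_type", None)
    # Fallback 1: fires on model_type alone, no architecture needed.
    if model_type in MODEL_TYPE_TO_TASK:
        return MODEL_TYPE_TO_TASK[model_type], "HF_MODEL_CLASS_MAPPING"
    task = task_from_architectures(config)
    if task is not None:
        return task, "architectures"  # hypothetical source label
    # Fallback 2: generic default when nothing else matched.
    return DEFAULT_TASK, "HF_TASK_DEFAULTS"


def strict_resolve(config):
    """Stand-in for resolve_task_and_model_class: no fallbacks."""
    task = task_from_architectures(config)
    if task is None:
        raise ValueError("task cannot be resolved: no resolvable architecture")
    return task


bare_sam = SimpleNamespace(model_type="sam", architectures=None)
bare_clip = SimpleNamespace(model_type="clip", architectures=None)
real_bert = SimpleNamespace(model_type="bert", architectures=["BertForMaskedLM"])

print(lenient_detect(bare_sam))    # fallback 1
print(lenient_detect(bare_clip))   # fallback 2
print(lenient_detect(real_bert), strict_resolve(real_bert))  # both agree
```

In this toy, strict_resolve(bare_sam) and strict_resolve(bare_clip) raise ValueError, matching the pattern in the reproduction: the lenient path succeeds only because of the two fallbacks the strict path lacks.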
Scope / severity
Low. Only affects configs without resolvable architectures (architecture-less --model-type invocations, synthetic/default configs). Every real checkpoint carries architectures, so the two paths agree there — verified across bert (MaskedLM/SeqClass), bart (CondGen/SeqClass), t5 (CondGen), resnet, dinov2 (D2 modality), sam, and clip with architectures present.
Options
- Document as intended: inspect is a lenient reporter; loader-backed commands are strict and should reject what they cannot load. (current behavior)
- Parity: either give resolve_task_and_model_class the same fallbacks (lenient: both succeed), or drop the fallbacks from detect_task (strict: both raise) so inspect and the model-loading commands always agree.
Either way, a consistency test comparing detect_task vs resolve_task_and_model_class (not just detect_task vs inspect_detect_task, which is all tests/integration/test_task_consistency.py covers today) would lock the invariant.
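One possible shape for such a consistency check is sketched below. The helper names (task_or_none, compare_task_paths) are hypothetical, and the sketch assumes that each path either raises ValueError or returns a task string, possibly as the first element of a tuple. The actual return shape of resolve_task_and_model_class should be checked before adopting this.

```python
def task_or_none(fn, config):
    """Call a task-detection function; return its task, or None if it raises."""
    try:
        result = fn(config)
    except ValueError:
        return None
    # Assumption: a tuple result carries the task as its first element.
    return result[0] if isinstance(result, tuple) else result


def compare_task_paths(config, detect, resolve):
    """Classify how two task-detection callables treat the same config."""
    detected = task_or_none(detect, config)
    resolved = task_or_none(resolve, config)
    if detected is None and resolved is None:
        return "both-raise"
    if detected is None or resolved is None:
        return "one-raises"  # the lenient-vs-strict split described above
    return "agree" if detected == resolved else "mismatch"


def assert_parity(config, detect, resolve):
    """Fail unless both paths agree or both reject the config."""
    outcome = compare_task_paths(config, detect, resolve)
    assert outcome in {"agree", "both-raise"}, outcome
```

In a real test, detect and resolve would be detect_task and resolve_task_and_model_class, and the configs would include architecture-less ones such as SamConfig() and CLIPConfig(). If the "document as intended" option is chosen instead, the same helper could assert that the outcome is "one-raises" for exactly those configs.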