obol-stack-side model ranking: keep, deprecate, or remove? (audit findings) #399

@bussyjd

Context

PR #379 (obol model prefer + provider smokes) was closed in favour of doing model control on the calling agent side rather than the LiteLLM side. The premise: each agent (Hermes, OpenClaw, future) has its own ordering and fallback mechanism, so it's redundant for obol-stack to also pick.

This issue documents an audit of whether that premise actually holds today, and what the cleanup path is.

Code under review

obol-stack maintains a model-ranking layer that picks a primary + fallbacks before the agents see the model list:

  • Core: internal/model/rank.go::Rank — capability-aware ranker (cloud > local; cloud quality table; local by parameter count). ~221 lines + rank_test.go.
  • OpenClaw wrapper: internal/openclaw/openclaw.go::rankModels — wraps Rank, adds openai/ prefix.
  • OpenClaw hand-off: internal/openclaw/openclaw.go::patchModelHierarchy — writes agents.defaults.model.{primary,fallbacks} into openclaw-config ConfigMap.
  • Hermes wrapper: internal/hermes/hermes.go::rankModels — wraps Rank (no prefix).
  • Hermes hand-off: internal/hermes/hermes.go::generateValues bakes primary into the API_SERVER_MODEL_NAME env var via helm values; generateConfig writes model.default = primary into the agent config.
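To make the ranking order above concrete, here is a minimal sketch of a cloud-first ranker that splits its result into a primary and fallbacks. It is not the code in internal/model/rank.go: the `Candidate` type, the `cloudQuality` table, the model names, and the `rank` / `withPrefix` helpers are all hypothetical and only illustrate the "cloud > local; cloud quality table; local by parameter count" rule.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a hypothetical model descriptor; the real type used by
// internal/model/rank.go may look different.
type Candidate struct {
	Name    string
	Cloud   bool    // true for hosted providers, false for local models
	ParamsB float64 // parameter count in billions (local models only)
}

// cloudQuality is an illustrative quality table. The names and scores are
// placeholders, not the values obol-stack ships.
var cloudQuality = map[string]int{
	"cloud-large": 2,
	"cloud-small": 1,
}

// rank orders candidates cloud-first, then by quality score (cloud) or by
// parameter count (local), and splits the result into primary + fallbacks.
func rank(cands []Candidate) (primary string, fallbacks []string) {
	if len(cands) == 0 {
		return "", nil
	}
	sorted := append([]Candidate(nil), cands...) // do not mutate the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Cloud != b.Cloud {
			return a.Cloud // cloud models sort before local ones
		}
		if a.Cloud {
			return cloudQuality[a.Name] > cloudQuality[b.Name]
		}
		return a.ParamsB > b.ParamsB
	})
	for _, c := range sorted[1:] {
		fallbacks = append(fallbacks, c.Name)
	}
	return sorted[0].Name, fallbacks
}

// withPrefix mimics the OpenClaw wrapper's "openai/" prefixing step.
func withPrefix(prefix string, names []string) []string {
	out := make([]string, len(names))
	for i, n := range names {
		out[i] = prefix + n
	}
	return out
}

func main() {
	primary, fallbacks := rank([]Candidate{
		{Name: "local-7b", ParamsB: 7},
		{Name: "cloud-small", Cloud: true},
		{Name: "local-70b", ParamsB: 70},
		{Name: "cloud-large", Cloud: true},
	})
	fmt.Println("primary:", withPrefix("openai/", []string{primary})[0])
	fmt.Println("fallbacks:", withPrefix("openai/", fallbacks))
}
```

The Hermes wrapper would use the same `rank` result without the prefixing step.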

Audit findings

The findings below come from read-only audits of both runtime repos.

OpenClaw (/Users/bussyjd/Development/R&D/openclaw)

  • Treats agents.defaults.model.primary as a hard pin, not a hint — src/agents/model-selection.ts:392-422, src/agents/agent-scope.ts:186-195.
  • Does NOT query LiteLLM /v1/models at runtime. Catalog loads once at startup from a bundled registry (src/agents/model-catalog.ts:80+), used only for display/validation.
  • Has fallback orchestration on error (src/agents/model-fallback.ts:78+) — walks the explicit fallbacks list in order. No capability ranking or auto-discovery.
  • If primary is empty/missing, falls back to hardcoded defaults in src/agents/defaults.ts — silently downgrades every agent.

Hermes (/Users/bussyjd/Development/R&D/hermes-agent, upstream nousresearch/hermes-agent)

  • Treats API_SERVER_MODEL_NAME and config model.default as hard pins. Read at cli.py:1797, gateway/platforms/api_server.py:541.
  • Does NOT query LiteLLM /v1/models at runtime. No code path discovers models dynamically from the LiteLLM sidecar. Pulls model lists from OpenRouter/Anthropic/provider catalogs only.
  • Has fallback chain via run_agent.py:6089::_try_activate_fallback — switches on error using a user-configured fallback_model list. No capability ranking, no auto-discovery.
  • If API_SERVER_MODEL_NAME is empty, the API server advertises a placeholder name on /v1/models; the CLI agent itself requires a model and fails to initialize.

Verdict

The premise that "each agent has its own ordering" doesn't fully hold. Both runtimes have reactive fallback (try next on error), not proactive capability ranking. Both treat the obol-stack-supplied primary as a hard pin: without it, OpenClaw silently downgrades to its hardcoded defaults, and the Hermes CLI agent fails to initialize (its API server only advertises a placeholder name).

So Rank is currently load-bearing. Removing it today (without coordinated upstream changes) breaks both runtimes.
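As a rough illustration of what "reactive fallback" means here, the sketch below tries a pinned primary and walks an explicit fallback list in order only when a call fails. It is not taken from either runtime: `callWithFallback`, `callFunc`, and `errUnavailable` are hypothetical names, and the empty-primary branch is a simplification of what the audit found.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a placeholder error for a model that cannot serve a request.
var errUnavailable = errors.New("model unavailable")

// callFunc is a hypothetical stand-in for "send this request to model m".
type callFunc func(model string) (string, error)

// callWithFallback tries the pinned primary first and then each fallback in
// the configured order. It never reorders or discovers models on its own,
// which is the behaviour the audit found in both runtimes.
func callWithFallback(primary string, fallbacks []string, call callFunc) (reply, used string, err error) {
	if primary == "" {
		// Simplification: the audit found that OpenClaw falls back to
		// hardcoded defaults here, while the Hermes CLI fails to initialize.
		return "", "", errors.New("no primary model configured")
	}
	var errs []error
	for _, m := range append([]string{primary}, fallbacks...) {
		out, callErr := call(m)
		if callErr == nil {
			return out, m, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", m, callErr))
	}
	return "", "", errors.Join(errs...)
}

func main() {
	// Pretend only the second fallback is healthy.
	call := func(m string) (string, error) {
		if m == "model-c" {
			return "hello from " + m, nil
		}
		return "", errUnavailable
	}
	reply, used, err := callWithFallback("model-a", []string{"model-b", "model-c"}, call)
	fmt.Println(reply, used, err)
}
```

Nothing in this loop decides which model should be first; that decision is exactly what Rank supplies today.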

Options

A. Keep Rank, document the contract — recommended now

Add a header comment to internal/model/rank.go and to both rankModels wrappers stating: "obol-stack picks a sensible primary because neither Hermes nor OpenClaw auto-discover from the LiteLLM model_list at runtime. Until they do, this layer is load-bearing." No code change. Eliminates "is this dead?" confusion for future readers. ~30 min.

B. Drop the wrappers, inline model.Rank(...) — minor cleanup

Saves ~30 lines of indirection. Doesn't change runtime behavior. Doesn't address the design question. ~1h. Low value if we're going to do A.

C. Full removal — blocked on upstream

Drop Rank, both wrappers, patchModelHierarchy, and the primary env-var/config plumbing. Push only the LiteLLM model_list to agents. Requires both upstream repos to add a startup-time /v1/models query and pick their own primary. Needs design alignment with the OpenClaw and Hermes maintainers before any obol-stack change.
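For reference, a startup-time discovery step could look roughly like the sketch below, written in Go for consistency with obol-stack even though the runtimes themselves are TypeScript and Python. It assumes the LiteLLM endpoint returns the OpenAI-compatible list shape (`{"data":[{"id":...}]}`); `discoverModels`, `parseModelList`, and `pickPrimary` are hypothetical names, and "first listed model wins" is a placeholder policy, not a proposal for how either runtime should choose.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// modelList mirrors only the fields of the OpenAI-style list response we need.
type modelList struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// parseModelList extracts the non-empty model IDs from a /v1/models body.
func parseModelList(body []byte) ([]string, error) {
	var ml modelList
	if err := json.Unmarshal(body, &ml); err != nil {
		return nil, fmt.Errorf("decode model list: %w", err)
	}
	ids := make([]string, 0, len(ml.Data))
	for _, m := range ml.Data {
		if m.ID != "" {
			ids = append(ids, m.ID)
		}
	}
	return ids, nil
}

// discoverModels queries <baseURL>/v1/models once, e.g. at agent startup.
func discoverModels(ctx context.Context, client *http.Client, baseURL, apiKey string) ([]string, error) {
	url := strings.TrimRight(baseURL, "/") + "/v1/models"
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: unexpected status %s", url, resp.Status)
	}
	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20)) // cap at 1 MiB
	if err != nil {
		return nil, err
	}
	return parseModelList(body)
}

// pickPrimary applies a placeholder policy: first listed model is primary.
func pickPrimary(ids []string) (string, []string, error) {
	if len(ids) == 0 {
		return "", nil, fmt.Errorf("no models advertised")
	}
	return ids[0], ids[1:], nil
}

func main() {
	// Stand-in for the LiteLLM sidecar so the sketch runs without a network.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v1/models" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"object":"list","data":[{"id":"model-a"},{"id":"model-b"}]}`)
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	ids, err := discoverModels(ctx, srv.Client(), srv.URL, "")
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	primary, fallbacks, err := pickPrimary(ids)
	fmt.Println(primary, fallbacks, err)
}
```

Whatever policy replaces `pickPrimary` is the part that needs design alignment with the upstream maintainers; the HTTP call itself is the easy half.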

Recommendation

A now. Open follow-up RFCs / coordinated PRs in openclaw (and an upstream issue/PR in nousresearch/hermes-agent) to add /v1/models auto-discovery at startup. Once both runtimes can stand on their own, revisit C.

Closing PR #379 was correct

The closure rationale ("model control on the calling agent side") is the right destination. This issue captures how to get there — the cleanup is gated on upstream changes, not on obol-stack alone.

cc @OisinKyne

Labels: enhancement (New feature or request), question (Further information is requested)