[BUG] Embedding path ignores ModelConfig.spec.tls — embeddings fail behind TLS-inspecting proxies even with disableVerify set #1992

@davidkarlsen

Summary

The embedding code path in kagent-adk ignores ModelConfig.spec.tls entirely. Even when an embedding ModelConfig carries spec.tls.disableVerify: true (or caCertSecretRef), the embedding client is constructed with no TLS configuration, so embeddings fail in any environment where the default certificate verification can't succeed (e.g. behind a TLS-inspecting corporate proxy).

This is closely related to #1991 (the Helm chart can't pass spec.tls onto the chart-generated default ModelConfig), but it is a distinct, separate gap: this one is in the embedding client code itself, not the chart — and it affects embeddings even when the ModelConfig is hand-authored with spec.tls set. A fix for #1991 will not fix this, and vice versa.

What happens

The chat / LLM path correctly honors spec.tls — it threads tls_disable_verify into a custom httpx client factory (kagent-adk types.py, ~L594-624). Setting spec.tls.disableVerify: true on a standalone ModelConfig makes the chat path work.

The embedding path does not. In packages/kagent-adk/src/kagent/adk/models/_embedding.py (_embed_openai):

client = AsyncOpenAI(base_url=self.config.base_url or None)

No http_client / verify override is passed. And the embedding config object has only model / provider / base_url fields — there is no tls field on the embedding config at all. So the spec.tls block on the referenced ModelConfig is silently unused for embeddings.

Symptom

Session-memory auto-save (which uses embeddings) fails with:

kagent.adk.models._embedding - ERROR - Error generating embedding with provider=openai model=<embedding-model>: Connection error.

Empirical confirmation

From inside an agent pod (embedding model rai/eu.cohere.embed-v4:0), against the same gateway:

| Client construction | Result |
| --- | --- |
| httpx ... verify=False | HTTP 200 (endpoint + token are fine) |
| default AsyncOpenAI(base_url=...) (what runs in prod) | APIConnectionError('Connection error.'), identical to the log above |
| httpx ... verify=<mounted corporate CA bundle> | CERTIFICATE_VERIFY_FAILED |

So the endpoint and credentials are correct; the only difference is that the embedding client has no way to be told how to handle TLS — there's no equivalent of the LLM path's tls_disable_verify.

(Underlying trigger in our environment: the egress TLS-inspection proxy's issuing CA is missing the Authority Key Identifier extension, which Python 3.13's default strict X.509 verification rejects. That's a proxy-side conformance problem, but it's exactly the situation spec.tls.disableVerify / caCertSecretRef exists to handle — and the embedding path can't reach it.)
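For anyone trying to confirm the same trigger, the strict-verification default can be inspected with the standard library alone. The sketch below checks whether `ssl.create_default_context()` enables `VERIFY_X509_STRICT` on the running interpreter (Python 3.13 turns it on by default) and shows how the flag can be cleared on a throwaway context for diagnosis. Whether the HTTP client inside the agent pod actually builds its context this way is an assumption, so treat this as a diagnostic aid rather than a fix.

```python
import ssl
import sys

# Default context as built by the standard library on this interpreter.
ctx = ssl.create_default_context()

# On Python 3.13+ this flag is part of the default verify_flags.
strict_enabled = bool(ctx.verify_flags & ssl.VERIFY_X509_STRICT)
print(sys.version_info[:2], "VERIFY_X509_STRICT enabled:", strict_enabled)

# Diagnostic only: a second context with strict X.509 checks cleared.
# Certificate and hostname verification remain enabled on this context.
relaxed = ssl.create_default_context()
relaxed.verify_flags &= ~ssl.VERIFY_X509_STRICT
```

If a handshake succeeds with `relaxed` plus the corporate CA bundle but fails with `ctx`, that points at the certificate conformance problem rather than at the endpoint or token.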

Request

Thread ModelConfig.spec.tls into the embedding client the same way the chat path does — e.g. construct AsyncOpenAI(http_client=httpx.AsyncClient(verify=...)) from the config's tls settings (disableVerify, caCertSecretRef / caCertSecretKey, disableSystemCAs). Ideally the embedding config gains the same tls field that the LLM ModelConfig already has, with identical semantics.
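A minimal sketch of what this could look like, not the actual kagent-adk code. `TLSSettings`, `build_verify`, and the `ca_cert_path` field are hypothetical names introduced for illustration; the mapping of `disableSystemCAs` to an empty trust store is my assumption about the intended semantics, and the real field names should follow whatever the chat path already uses. The helper itself only needs the standard library; the wiring into `httpx.AsyncClient` and `AsyncOpenAI` is shown in comments.

```python
import ssl
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TLSSettings:
    """Hypothetical mirror of ModelConfig.spec.tls on the embedding config."""
    disable_verify: bool = False
    # Assumed: path where the caCertSecretRef / caCertSecretKey is mounted.
    ca_cert_path: Optional[str] = None
    disable_system_cas: bool = False


def build_verify(tls: Optional[TLSSettings]) -> Union[bool, ssl.SSLContext]:
    """Return a value suitable for httpx.AsyncClient(verify=...)."""
    if tls is None:
        return True  # no TLS block: keep the library's default verification
    if tls.disable_verify:
        return False  # skip certificate verification entirely
    if tls.disable_system_cas:
        # Start from an empty trust store; PROTOCOL_TLS_CLIENT still
        # requires a valid certificate and checks the hostname.
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    else:
        ctx = ssl.create_default_context()
    if tls.ca_cert_path:
        ctx.load_verify_locations(cafile=tls.ca_cert_path)
    return ctx


# Wiring into the embedding client (needs httpx and openai installed):
#   http_client = httpx.AsyncClient(verify=build_verify(config.tls))
#   client = AsyncOpenAI(base_url=config.base_url or None,
#                        http_client=http_client)
```

Keeping the decision in one helper would let the chat and embedding paths share identical TLS behavior instead of each building its own client options.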

Related

Chart version observed: 0.9.6 (oci://ghcr.io/kagent-dev/kagent/helm/kagent --version 0.9.6).
