Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,10 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The

## [Unreleased]

### Added

- **HyperDX OTel integration test path and "Production swap" docs in example 03.** `examples/03-observer-hooks/main.py`'s module docstring grows a "Production swap" section showing how to replace the demo's `SimpleSpanProcessor` + `ConsoleSpanExporter` with `BatchSpanProcessor` + `OTLPSpanExporter` pointed at HyperDX (or any other OTLP-HTTP collector). A new opt-in integration test (`tests/integration/test_otel_hyperdx_export.py`, gated by `HYPERDX_API_KEY` + `HYPERDX_OTLP_ENDPOINT` env vars and `@pytest.mark.integration`) drives the same production export path end-to-end against a live endpoint. `opentelemetry-exporter-otlp-proto-http` lands as a dev-only dep; not promoted to a public extras group yet.

### Changed (breaking)

- **`OpenAIProvider.ready()` default probe flipped to `chat_completions`.** A new constructor kwarg `readiness_probe: Literal["models", "chat_completions", "both"]` selects which wire path `ready()` exercises; the default is now the chat-completions path (`POST /v1/chat/completions` with `max_tokens=1`), which actually exercises the inference path. The previous catalog-only behavior is still available as `readiness_probe="models"`, and `readiness_probe="both"` runs catalog then chat for the strongest signal. Motivation: OpenAI-compatible proxies (Bifrost and similar) can return 200 on `GET /v1/models` while rejecting `POST /v1/chat/completions`, leaving the catalog probe green while every real call fails. The new default surfaces that class of failure at preflight rather than at first inference. Non-200 chat-probe responses route through `classify_http_error`, so the canonical error categories (`provider_authentication`, `provider_unavailable`, `provider_invalid_model`, etc.) surface consistently. Callers that depended on the catalog-only behavior (cost-sensitive cloud setups where every `ready()` would now bill prompt tokens) can opt back in by passing `readiness_probe="models"`.
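
The three probe modes in this entry can be pictured with a small stand-alone sketch. It is not the `OpenAIProvider` implementation: `probe_requests` and its return shape are hypothetical, used only to illustrate which wire paths each `readiness_probe` value would exercise, assuming the paths quoted in the entry.

```python
from typing import List, Literal, Tuple

ReadinessProbe = Literal["models", "chat_completions", "both"]


def probe_requests(mode: ReadinessProbe = "chat_completions") -> List[Tuple[str, str]]:
    """Return the (method, path) pairs a readiness check would issue, in order.

    Hypothetical helper: it models the three modes, not the provider's code.
    """
    catalog = ("GET", "/v1/models")
    chat = ("POST", "/v1/chat/completions")  # the entry says this is sent with max_tokens=1
    if mode == "models":
        return [catalog]  # previous catalog-only behavior
    if mode == "chat_completions":
        return [chat]  # new default: exercises the inference path
    if mode == "both":
        return [catalog, chat]  # catalog first, then chat
    raise ValueError(f"unknown readiness_probe: {mode!r}")
```

Under this model, a proxy that answers `GET /v1/models` with 200 but rejects chat completions passes only the `"models"` mode, which is the failure class the new default is meant to catch.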
40 changes: 40 additions & 0 deletions examples/03-observer-hooks/main.py
@@ -35,6 +35,46 @@
LLM_API_KEY=sk-... uv run python main.py "explain why NASA is returning to the moon with Artemis"

(``--all-extras`` pulls in ``opentelemetry-sdk`` for the OTel observer.)

**Production swap: real OTLP exporter (e.g. HyperDX).**

The example wires ``OTelObserver`` to a ``SimpleSpanProcessor`` +
``ConsoleSpanExporter`` so every span prints to stdout. That is fine
for a short-lived demo and wrong for production: synchronous export
blocks each node boundary, and printing is not ingestion. For a real
backend (HyperDX, Honeycomb, Tempo, any OTLP-HTTP collector), swap to
``BatchSpanProcessor`` + ``OTLPSpanExporter`` pointing at your
collector and supplying its auth header. The HyperDX shape::

    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    otel_observer = OTelObserver(
        span_processor=BatchSpanProcessor(
            OTLPSpanExporter(
                endpoint="https://in-otel.hyperdx.io/v1/traces",
                # HyperDX accepts the API key as a bare ``authorization``
                # value. Other collectors expect ``Bearer <token>``;
                # check your destination's docs. The bracket-form
                # ``os.environ[...]`` is intentional: unlike ``LLM_API_KEY``
                # (which permits None for unauthenticated local servers),
                # a missing HyperDX key would silently send unauthenticated
                # requests, so fail-loud at boot is the right shape.
                headers={"authorization": os.environ["HYPERDX_API_KEY"]},
            )
        ),
        resource=Resource.create({"service.name": "openarmature-demo-answers"}),
    )

Same observer call surface; only the processor + exporter change. The
``OTLPSpanExporter`` lives in the ``opentelemetry-exporter-otlp-proto-http``
package (not in ``[otel]`` extras yet; install it directly while OA
gauges demand). Before short-lived processes exit, call
``await graph.drain()`` (drains the observer's per-invocation event
queue so spans see their ``completed`` events) and then
``otel_observer.force_flush()`` (synchronous; pushes
``BatchSpanProcessor``'s tail through the exporter). The drain + flush
pair ensures the tail lands before teardown.
"""

from __future__ import annotations
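
The fail-loud header choice described in the docstring above can be sketched in isolation. `otlp_auth_headers` is a hypothetical helper, not part of OpenArmature or the OTel SDK; it assumes only what the docstring states: HyperDX takes the bare key as the `authorization` value, other collectors may want a `Bearer` prefix, and a missing key should raise at startup rather than send unauthenticated requests.

```python
import os
from typing import Dict, Mapping, Optional


def otlp_auth_headers(
    env: Mapping[str, str] = os.environ,
    key_var: str = "HYPERDX_API_KEY",
    scheme: Optional[str] = None,
) -> Dict[str, str]:
    """Build the OTLP auth header, raising KeyError if the key is unset.

    Hypothetical helper for illustration only.
    """
    # Bracket lookup on purpose: ``env.get(...)`` would return None and
    # let the process start with no usable credentials.
    token = env[key_var]
    # ``scheme=None`` yields the bare value (the HyperDX shape);
    # ``scheme="Bearer"`` yields ``Bearer <token>`` for collectors that expect it.
    value = f"{scheme} {token}" if scheme else token
    return {"authorization": value}
```

The result would be passed as the `headers=` argument of `OTLPSpanExporter`; calling the helper while wiring the observer keeps the failure at boot instead of at first export.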
5 changes: 5 additions & 0 deletions pyproject.toml
@@ -75,6 +75,11 @@ dev = [
    "pyyaml>=6.0",
    "ruff>=0.5",
    "types-pyyaml",
    # Used only by ``tests/integration/test_otel_hyperdx_export.py``
    # against a live HyperDX endpoint. Not promoted to a public extras
    # group yet: one downstream user, one destination; revisit when
    # multiple users want OTLP-HTTP export packaged.
    "opentelemetry-exporter-otlp-proto-http>=1.27,<3",
]
docs = [
"mkdocs>=1.6,<2",
125 changes: 125 additions & 0 deletions tests/integration/test_otel_hyperdx_export.py
@@ -0,0 +1,125 @@
"""Integration test for OTel span export against a live HyperDX endpoint.

Gated by the presence of ``HYPERDX_API_KEY`` + ``HYPERDX_OTLP_ENDPOINT``
env vars. Skipped in CI and local runs that don't have credentials in
scope; runs end-to-end against HyperDX Cloud (or any other OTLP-HTTP
collector) when invoked from a shell with both env vars sourced.

``HYPERDX_OTLP_ENDPOINT`` MUST be the full traces-collector URL
including the ``/v1/traces`` path suffix, e.g.::

    HYPERDX_OTLP_ENDPOINT=https://in-otel.hyperdx.io/v1/traces

``OTLPSpanExporter`` uses the ``endpoint`` kwarg verbatim and does
not append the path itself (that auto-append only happens for the
``OTEL_EXPORTER_OTLP_ENDPOINT`` host-only convention this test does
not use). A host-only URL POSTs to ``/`` and HyperDX 404s.

The test verifies the production export path the documentation
recommends (``BatchSpanProcessor`` + ``OTLPSpanExporter``) drains
cleanly from the local pipeline. The assertion is local-side: the
BatchSpanProcessor's ``force_flush`` succeeded within the deadline.
HyperDX-side acceptance (auth, payload accepted, span visible in the
UI) is verified by checking the HyperDX UI for a span named ``ping``
under service ``openarmature-hyperdx-integration``; the OTel SDK
swallows exporter errors silently, so a local-side success does not
prove the collector received the spans.
"""

from __future__ import annotations

import os

import pytest

# Skip the entire module when credentials / endpoint aren't sourced.
# Avoids an ImportError cascade from the OTLP exporter if its env-var
# fallback also can't find a target.
pytestmark = pytest.mark.skipif(
    not (os.environ.get("HYPERDX_API_KEY") and os.environ.get("HYPERDX_OTLP_ENDPOINT")),
    reason=(
        "Requires HYPERDX_API_KEY + HYPERDX_OTLP_ENDPOINT (live HyperDX endpoint); "
        "endpoint MUST include the /v1/traces path suffix"
    ),
)


@pytest.mark.integration
async def test_otel_observer_pipeline_drains_with_hyperdx_exporter() -> None:
    """End-to-end: invoke a tiny graph under an OTelObserver wired to
    the OTLPSpanExporter pointing at the configured HyperDX endpoint,
    flush, and assert the local pipeline drained within the deadline.
    """
    # Imports inside the function so the heavy OTLP-protobuf
    # dependencies don't load when the module is collected and skipped
    # under the default ``-m "not integration"`` pytest filter.
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    from openarmature.graph import END, GraphBuilder, State
    from openarmature.observability.otel import OTelObserver

    # Enforce the documented endpoint shape at runtime. The
    # ``OTLPSpanExporter`` uses the URL verbatim and does not append
    # ``/v1/traces`` itself, so a host-only URL POSTs to ``/`` and
    # HyperDX 404s; the SDK swallows that response and ``force_flush``
    # still returns True, which would mask a misconfigured env var
    # behind a passing test.
    endpoint = os.environ["HYPERDX_OTLP_ENDPOINT"]
    assert endpoint.endswith("/v1/traces"), (
        f"HYPERDX_OTLP_ENDPOINT must end with /v1/traces (got {endpoint!r}); "
        "OTLPSpanExporter uses the URL verbatim and does not append paths."
    )

    # HyperDX accepts the API key as a bare ``authorization`` header
    # value (no ``Bearer`` prefix). Other OTLP collectors that expect
    # ``Bearer <token>`` will need the caller to format the header
    # themselves; this is the documented HyperDX shape.
    exporter = OTLPSpanExporter(
        endpoint=endpoint,
        headers={"authorization": os.environ["HYPERDX_API_KEY"]},
    )

    observer = OTelObserver(
        span_processor=BatchSpanProcessor(exporter),
        resource=Resource.create({"service.name": "openarmature-hyperdx-integration"}),
    )

    class _PingState(State):
        ping: bool = False

    async def _node(_s: _PingState) -> dict[str, bool]:
        return {"ping": True}

    graph = GraphBuilder(_PingState).add_node("ping", _node).add_edge("ping", END).set_entry("ping").compile()
    graph.attach_observer(observer)

    try:
        final = await graph.invoke(_PingState())
        assert final.ping is True

        # ``invoke()`` returns when the graph reaches END but observer
        # events sit on a per-invocation queue until the background
        # worker drains them. Without ``drain()``, a span that hasn't
        # yet seen its ``completed`` event is still open when
        # ``force_flush`` runs, and the exporter would ship only the
        # ``started`` half (or nothing at all). The short-lived-process
        # pattern in ``docs/agent/non-obvious-shapes.md`` makes this
        # explicit.
        await graph.drain()

        # Local-side assertion. ``BatchSpanProcessor.force_flush``
        # returns True when every registered processor finishes
        # flushing within the timeout, False when any one times out.
        # The OTel SDK swallows exporter-side errors (401s, schema
        # rejections) silently, so a True here proves the pipeline
        # drained but not that HyperDX accepted the payload; that
        # confirmation is in the HyperDX UI.
        flushed = observer.force_flush(timeout_ms=15_000)
        assert flushed, "BatchSpanProcessor did not finish flushing within 15s"
    finally:
        # Releases the BatchSpanProcessor's background export thread;
        # ``OTelObserver.shutdown`` is idempotent and calls
        # ``_provider.shutdown`` under the hood.
        observer.shutdown()
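
The endpoint rule this test enforces (explicit `endpoint` used verbatim, path appended only for the env-var base URL) can be modelled in a few lines. This is a simplified sketch of the documented behavior, not the exporter's source: `resolve_traces_url` and `append_traces_path` are hypothetical names, and the `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT` branch and the `http://localhost:4318` default follow the OTel exporter configuration spec rather than anything in this PR.

```python
from typing import Mapping, Optional

DEFAULT_BASE = "http://localhost:4318"  # spec default for OTLP over HTTP (assumption)
TRACES_PATH = "v1/traces"


def append_traces_path(base: str) -> str:
    """Append the per-signal path to a base URL, avoiding a double slash."""
    return base + TRACES_PATH if base.endswith("/") else f"{base}/{TRACES_PATH}"


def resolve_traces_url(endpoint: Optional[str], env: Mapping[str, str]) -> str:
    """Model which URL span batches would be POSTed to (illustrative only)."""
    if endpoint:
        # Explicit kwarg: used verbatim, nothing appended. A host-only
        # value therefore targets "/" and not "/v1/traces".
        return endpoint
    specific = env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")
    if specific:
        return specific  # per-signal env var is also used as-is
    # Base-URL convention: the per-signal path is appended here only.
    return append_traces_path(env.get("OTEL_EXPORTER_OTLP_ENDPOINT", DEFAULT_BASE))
```

In this model `resolve_traces_url("https://in-otel.hyperdx.io", {})` returns the host-only URL unchanged, which is the misconfiguration the `endswith("/v1/traces")` assertion in the test guards against.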
2 changes: 2 additions & 0 deletions uv.lock

Some generated files are not rendered by default.