LunarCommand · chris-colinsky · Jun 1, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,6 +6,10 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
 
 ## [Unreleased]
 
+### Added
+
+- **HyperDX OTel integration test path and "Production swap" docs in example 03.** `examples/03-observer-hooks/main.py`'s module docstring grows a "Production swap" section showing how to replace the demo's `SimpleSpanProcessor` + `ConsoleSpanExporter` with `BatchSpanProcessor` + `OTLPSpanExporter` pointed at HyperDX (or any other OTLP-HTTP collector). A new opt-in integration test (`tests/integration/test_otel_hyperdx_export.py`, gated by `HYPERDX_API_KEY` + `HYPERDX_OTLP_ENDPOINT` env vars and `@pytest.mark.integration`) drives the same production export path end-to-end against a live endpoint. `opentelemetry-exporter-otlp-proto-http` lands as a dev-only dependency; it is not promoted to a public extras group yet.
+
 ### Changed (breaking)
 
 - **`OpenAIProvider.ready()` default probe flipped to `chat_completions`.** A new constructor kwarg `readiness_probe: Literal["models", "chat_completions", "both"]` selects which wire path `ready()` exercises; the default is now the chat-completions path (`POST /v1/chat/completions` with `max_tokens=1`), which actually exercises the inference path. The previous catalog-only behavior is still available as `readiness_probe="models"`, and `readiness_probe="both"` runs catalog then chat for the strongest signal. Motivation: OpenAI-compatible proxies (Bifrost and similar) can return 200 on `GET /v1/models` while rejecting `POST /v1/chat/completions`, leaving the catalog probe green while every real call fails. The new default surfaces that class of failure at preflight rather than at first inference. Non-200 chat-probe responses route through `classify_http_error`, so the canonical error categories (`provider_authentication`, `provider_unavailable`, `provider_invalid_model`, etc.) surface consistently. Callers that depended on the catalog-only behavior (cost-sensitive cloud setups where every `ready()` would now bill prompt tokens) can opt back in by passing `readiness_probe="models"`.

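The probe selection described in the `readiness_probe` entry above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the `OpenAIProvider` implementation: `FakeProxy` and the free function `ready` are hypothetical names, and the fake transport stands in for an OpenAI-compatible proxy that serves the catalog but rejects chat completions.

```python
from dataclasses import dataclass
from typing import Literal

ReadinessProbe = Literal["models", "chat_completions", "both"]


@dataclass
class FakeProxy:
    """Hypothetical stand-in for an OpenAI-compatible proxy."""

    models_status: int = 200
    chat_status: int = 200

    def get_models(self) -> int:
        # Stands in for GET /v1/models.
        return self.models_status

    def post_chat(self, max_tokens: int = 1) -> int:
        # Stands in for POST /v1/chat/completions with max_tokens=1.
        return self.chat_status


def ready(proxy: FakeProxy, readiness_probe: ReadinessProbe = "chat_completions") -> bool:
    """Return True only if every selected probe answers 200."""
    # "both" runs the catalog probe first, then the chat probe.
    if readiness_probe in ("models", "both"):
        if proxy.get_models() != 200:
            return False
    if readiness_probe in ("chat_completions", "both"):
        if proxy.post_chat(max_tokens=1) != 200:
            return False
    return True


# A proxy that lists models but rejects inference calls.
broken = FakeProxy(models_status=200, chat_status=403)
catalog_only = ready(broken, "models")  # catalog probe stays green
default_probe = ready(broken)           # chat probe surfaces the failure
both_probes = ready(broken, "both")
```

With the catalog-only probe the broken proxy looks healthy, while the default and `"both"` settings report it as not ready, which is the failure class the changelog entry describes.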
diff --git a/examples/03-observer-hooks/main.py b/examples/03-observer-hooks/main.py
@@ -35,6 +35,46 @@
     LLM_API_KEY=sk-... uv run python main.py "explain why NASA is returning to the moon with Artemis"
 
 (``--all-extras`` pulls in ``opentelemetry-sdk`` for the OTel observer.)
+
+**Production swap: real OTLP exporter (e.g. HyperDX).**
+
+The example wires ``OTelObserver`` to a ``SimpleSpanProcessor`` +
+``ConsoleSpanExporter`` so every span prints to stdout. That is fine
+for a short-lived demo and wrong for production: synchronous export
+blocks each node boundary, and printing is not ingestion. For a real
+backend (HyperDX, Honeycomb, Tempo, any OTLP-HTTP collector), swap to
+``BatchSpanProcessor`` + ``OTLPSpanExporter`` pointing at your
+collector and supplying its auth header. The HyperDX shape::
+
+    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
+    from opentelemetry.sdk.trace.export import BatchSpanProcessor
+
+    otel_observer = OTelObserver(
+        span_processor=BatchSpanProcessor(
+            OTLPSpanExporter(
+                endpoint="https://in-otel.hyperdx.io/v1/traces",
+                # HyperDX accepts the API key as a bare ``authorization``
+                # value. Other collectors expect ``Bearer <token>``;
+                # check your destination's docs. The bracket-form
+                # ``os.environ[...]`` is intentional: unlike ``LLM_API_KEY``
+                # (which permits None for unauthenticated local servers),
+                # a HyperDX key read with ``.get()`` could be None and ship
+                # unauthenticated requests, so fail loud at boot instead.
+                headers={"authorization": os.environ["HYPERDX_API_KEY"]},
+            )
+        ),
+        resource=Resource.create({"service.name": "openarmature-demo-answers"}),
+    )
+
+Same observer call surface; only the processor + exporter change. The
+``OTLPSpanExporter`` lives in the ``opentelemetry-exporter-otlp-proto-http``
+package (not in ``[otel]`` extras yet; install it directly while OA
+gauges demand). Before short-lived processes exit, call
+``await graph.drain()`` (drains the observer's per-invocation event
+queue so spans see their ``completed`` events) and then
+``otel_observer.force_flush()`` (synchronous; pushes
+``BatchSpanProcessor``'s tail through the exporter). The drain + flush
+pair ensures the tail lands before teardown.
 """
 
 from __future__ import annotations

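The drain-then-flush ordering from the "Production swap" docstring can be illustrated with a toy model. This is a sketch only: `ToyPipeline` is a hypothetical class that mimics an event queue feeding a batching exporter, and it does not use the real `OTelObserver`, `graph.drain()`, or `BatchSpanProcessor`.

```python
import asyncio


class ToyPipeline:
    """Hypothetical model: an event queue feeding a batching exporter."""

    def __init__(self) -> None:
        self.events: asyncio.Queue = asyncio.Queue()
        self.finished_spans: list = []  # ended spans waiting in the batch
        self.exported: list = []        # spans handed to the exporter

    def emit_completed(self, span_name: str) -> None:
        # A node finished; its "completed" event waits on the queue.
        self.events.put_nowait(span_name)

    async def drain(self) -> None:
        # Process every queued event so each span is ended and batched.
        while not self.events.empty():
            self.finished_spans.append(self.events.get_nowait())
            await asyncio.sleep(0)  # yield control, as a background worker would

    def force_flush(self) -> bool:
        # Synchronously push the batched tail through the exporter.
        self.exported.extend(self.finished_spans)
        self.finished_spans.clear()
        return True


async def run(drain_first: bool) -> list:
    pipeline = ToyPipeline()
    pipeline.emit_completed("ping")
    if drain_first:
        await pipeline.drain()
    pipeline.force_flush()
    return pipeline.exported


flush_only = asyncio.run(run(drain_first=False))
drain_then_flush = asyncio.run(run(drain_first=True))
```

In this model, flushing without draining exports nothing because the `completed` event is still queued; draining first lets the flush carry the finished span.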
diff --git a/pyproject.toml b/pyproject.toml
@@ -75,6 +75,11 @@ dev = [
     "pyyaml>=6.0",
     "ruff>=0.5",
     "types-pyyaml",
+    # Used only by ``tests/integration/test_otel_hyperdx_export.py``
+    # against a live HyperDX endpoint. Not promoted to a public extras
+    # group yet: one downstream user, one destination; revisit when
+    # multiple users want OTLP-HTTP export packaged.
+    "opentelemetry-exporter-otlp-proto-http>=1.27,<3",
 ]
 docs = [
     "mkdocs>=1.6,<2",

diff --git a/tests/integration/test_otel_hyperdx_export.py b/tests/integration/test_otel_hyperdx_export.py
@@ -0,0 +1,125 @@
+"""Integration test for OTel span export against a live HyperDX endpoint.
+
+Gated by the presence of ``HYPERDX_API_KEY`` + ``HYPERDX_OTLP_ENDPOINT``
+env vars. Skipped in CI and local runs that don't have credentials in
+scope; runs end-to-end against HyperDX Cloud (or any other OTLP-HTTP
+collector) when invoked from a shell with both env vars sourced.
+
+``HYPERDX_OTLP_ENDPOINT`` MUST be the full traces-collector URL
+including the ``/v1/traces`` path suffix, e.g.::
+
+    HYPERDX_OTLP_ENDPOINT=https://in-otel.hyperdx.io/v1/traces
+
+``OTLPSpanExporter`` uses the ``endpoint`` kwarg verbatim and does
+not append the path itself (that auto-append only happens for the
+``OTEL_EXPORTER_OTLP_ENDPOINT`` host-only convention this test does
+not use). A host-only URL POSTs to ``/`` and HyperDX 404s.
+
+The test verifies that the production export path the documentation
+recommends (``BatchSpanProcessor`` + ``OTLPSpanExporter``) drains
+cleanly from the local pipeline. The assertion is local-side only:
+``force_flush`` succeeded within the deadline. HyperDX-side acceptance
+(auth, payload accepted, span visible) must be confirmed manually by
+checking the HyperDX UI for a span named ``ping`` under service
+``openarmature-hyperdx-integration``; the OTel SDK swallows exporter
+errors (it logs them at most and never raises), so a local-side
+success does not prove the collector received the spans.
+"""
+
+from __future__ import annotations
+
+import os
+
+import pytest
+
+# Skip the entire module when credentials / endpoint aren't sourced,
+# so the bracket-form ``os.environ[...]`` lookups in the test body
+# never raise ``KeyError`` on a machine without HyperDX access.
+pytestmark = pytest.mark.skipif(
+    not (os.environ.get("HYPERDX_API_KEY") and os.environ.get("HYPERDX_OTLP_ENDPOINT")),
+    reason=(
+        "Requires HYPERDX_API_KEY + HYPERDX_OTLP_ENDPOINT (live HyperDX endpoint); "
+        "endpoint MUST include the /v1/traces path suffix"
+    ),
+)
+
+
+@pytest.mark.integration
+async def test_otel_observer_pipeline_drains_with_hyperdx_exporter() -> None:
+    """End-to-end: invoke a tiny graph under an OTelObserver wired to
+    the OTLPSpanExporter pointing at the configured HyperDX endpoint,
+    flush, and assert the local pipeline drained within the deadline.
+    """
+    # Imports inside the function so the heavy OTLP-protobuf
+    # dependencies don't load when the module is collected and skipped
+    # under the default ``-m "not integration"`` pytest filter.
+    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
+    from opentelemetry.sdk.resources import Resource
+    from opentelemetry.sdk.trace.export import BatchSpanProcessor
+
+    from openarmature.graph import END, GraphBuilder, State
+    from openarmature.observability.otel import OTelObserver
+
+    # Enforce the documented endpoint shape at runtime. The
+    # ``OTLPSpanExporter`` uses the URL verbatim and does not append
+    # ``/v1/traces`` itself, so a host-only URL POSTs to ``/`` and
+    # HyperDX 404s; the SDK swallows that response and ``force_flush``
+    # still returns True, which would mask a misconfigured env var
+    # behind a passing test.
+    endpoint = os.environ["HYPERDX_OTLP_ENDPOINT"]
+    assert endpoint.endswith("/v1/traces"), (
+        f"HYPERDX_OTLP_ENDPOINT must end with /v1/traces (got {endpoint!r}); "
+        "OTLPSpanExporter uses the URL verbatim and does not append paths."
+    )
+
+    # HyperDX accepts the API key as a bare ``authorization`` header
+    # value (no ``Bearer`` prefix). Other OTLP collectors that expect
+    # ``Bearer <token>`` will need the caller to format the header
+    # themselves; this is the documented HyperDX shape.
+    exporter = OTLPSpanExporter(
+        endpoint=endpoint,
+        headers={"authorization": os.environ["HYPERDX_API_KEY"]},
+    )
+
+    observer = OTelObserver(
+        span_processor=BatchSpanProcessor(exporter),
+        resource=Resource.create({"service.name": "openarmature-hyperdx-integration"}),
+    )
+
+    class _PingState(State):
+        ping: bool = False
+
+    async def _node(_s: _PingState) -> dict[str, bool]:
+        return {"ping": True}
+
+    graph = GraphBuilder(_PingState).add_node("ping", _node).add_edge("ping", END).set_entry("ping").compile()
+    graph.attach_observer(observer)
+
+    try:
+        final = await graph.invoke(_PingState())
+        assert final.ping is True
+
+        # ``invoke()`` returns when the graph reaches END but observer
+        # events sit on a per-invocation queue until the background
+        # worker drains them. Without ``drain()``, a span that hasn't
+        # yet seen its ``completed`` event is still open when
+        # ``force_flush`` runs, and the exporter would ship only the
+        # ``started`` half (or nothing at all). The short-lived-process
+        # pattern in ``docs/agent/non-obvious-shapes.md`` makes this
+        # explicit.
+        await graph.drain()
+
+        # Local-side assertion. ``observer.force_flush`` returns True
+        # when every registered span processor (here the single
+        # ``BatchSpanProcessor``) finishes flushing within the timeout,
+        # False otherwise. The OTel SDK swallows exporter-side errors
+        # (401s, schema rejections) without raising, so a True here
+        # proves the pipeline drained but not that HyperDX accepted
+        # the payload; that confirmation is in the HyperDX UI.
+        flushed = observer.force_flush(timeout_ms=15_000)
+        assert flushed, "BatchSpanProcessor did not finish flushing within 15s"
+    finally:
+        # Releases the BatchSpanProcessor's background export thread;
+        # ``OTelObserver.shutdown`` is idempotent and calls
+        # ``_provider.shutdown`` under the hood.
+        observer.shutdown()
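The endpoint and header rules the test enforces can be factored into small helpers. The sketch below uses only the standard library and assumes nothing about the exporter; `validate_traces_endpoint` and `build_headers` are hypothetical names that do not exist in the repository, and `example-key` is a placeholder value.

```python
from __future__ import annotations

from typing import Mapping
from urllib.parse import urlsplit

TRACES_SUFFIX = "/v1/traces"


def validate_traces_endpoint(endpoint: str) -> str:
    """Return the endpoint unchanged if it is a full traces-collector URL."""
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {endpoint!r}")
    # The exporter is given the URL verbatim, so the path must already be there.
    if not parts.path.endswith(TRACES_SUFFIX):
        raise ValueError(f"endpoint must end with {TRACES_SUFFIX} (got {endpoint!r})")
    return endpoint


def build_headers(env: Mapping[str, str], bearer: bool = False) -> dict[str, str]:
    """Build the auth header; a missing key raises KeyError (fail loud)."""
    api_key = env["HYPERDX_API_KEY"]
    # HyperDX takes the bare key; other collectors may expect a Bearer prefix.
    value = f"Bearer {api_key}" if bearer else api_key
    return {"authorization": value}


ok = validate_traces_endpoint("https://in-otel.hyperdx.io/v1/traces")
bare = build_headers({"HYPERDX_API_KEY": "example-key"})
prefixed = build_headers({"HYPERDX_API_KEY": "example-key"}, bearer=True)
```

Passing a mapping instead of reading `os.environ` directly keeps the helpers testable without touching the process environment; a caller would pass `os.environ` in practice.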
diff --git a/uv.lock b/uv.lock