pretext: spec POST /pipelines/{id}/encode #155

Open
beckyconning wants to merge 1 commit into master from bc-pipeline-encode-route

@beckyconning
Contributor

Summary

Documents the new tenant-authed query-embedding route added in precog/services#3504.

  • Body: { "text": "..." }
  • Response: forwarded verbatim from the in-cluster vectorizer-encoder: { "vec": [...], "dim": 768, "model_id": "..." }
  • Returns 200 on success, 400 on empty text, 401 if unauthenticated, 404 if the caller can't see the pipeline, 502 if the encoder is unreachable
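
For reviewers, the request and status contract listed above can be sketched from a client's point of view. This is a minimal illustration under stated assumptions, not the service's actual client: the base URL, the `Authorization: Bearer` header, and the helper names `build_encode_request` and `interpret_encode_response` are hypothetical. Only the path, the body shape, the response envelope, and the status codes come from this PR.

```python
import json
import urllib.request

# Status codes listed in the summary, mapped to short explanations.
ENCODE_ERRORS = {
    400: "empty text",
    401: "unauthenticated",
    404: "pipeline not visible to caller",
    502: "encoder unreachable",
}


def build_encode_request(
    base_url: str, pipeline_id: str, text: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /pipelines/{id}/encode request.

    base_url and the Bearer scheme are assumptions; the PR only says the
    route is tenant-authed.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/pipelines/{pipeline_id}/encode",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # hypothetical auth header
        },
    )


def interpret_encode_response(status: int, raw_body: bytes) -> dict:
    """Return the encoder envelope on 200, raise on any other status."""
    if status == 200:
        return json.loads(raw_body)
    reason = ENCODE_ERRORS.get(status, "undocumented status")
    raise RuntimeError(f"encode failed with {status}: {reason}")
```

Keeping request construction separate from response interpretation lets both halves be exercised without a live encoder, which may be useful until precog/services#3504 is deployed.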

Slotted next to the existing vectorize PUT — same auth model, same family.

Why

External clients (notably the MCP server's semantic_search tool) need to embed user queries with the same sentence-transformers/all-mpnet-base-v2 model the corpus was vectorized with. Running the embedder in-process meant loading @xenova/transformers and onnxruntime-node inside a Vercel serverless function, which has been the failure mode behind recent "Failed to embed the query" errors. Exposing the encoder behind the public API gives us tenant auth and a single dedicated model pod instead.

Test plan

  • YAML lints cleanly
  • Confirm shape against actual response from staging /pipelines/{id}/encode once precog/services#3504 lands
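
The shape confirmation in the second item could be done with a small check like the one below. It is a sketch under assumptions: the function name `check_encode_envelope` is hypothetical, the expected `dim` of 768 is taken from the example envelope in this PR, and the rule that `len(vec)` equals `dim` is an assumption about what `dim` means, to be confirmed against staging.

```python
from numbers import Real

EXPECTED_DIM = 768  # taken from the example envelope in this PR


def check_encode_envelope(envelope: dict) -> list[str]:
    """Return a list of shape problems; an empty list means the envelope matches."""
    problems = []
    vec = envelope.get("vec")
    # vec should be a JSON array of numbers (bools are excluded explicitly).
    if not isinstance(vec, list) or not all(
        isinstance(x, Real) and not isinstance(x, bool) for x in vec
    ):
        problems.append("vec must be a list of numbers")
    if envelope.get("dim") != EXPECTED_DIM:
        problems.append(f"dim must be {EXPECTED_DIM}")
    # Assumption: dim describes the length of vec.
    if isinstance(vec, list) and len(vec) != envelope.get("dim"):
        problems.append("len(vec) must equal dim")
    if not isinstance(envelope.get("model_id"), str):
        problems.append("model_id must be a string")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every mismatch from a single staging response.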

🤖 Generated with Claude Code

Documents the new tenant-authed query-embedding route added in
precog/services#3504. Body is `{"text": "..."}` and the response is
the verbatim encoder envelope (`{"vec":[...],"dim":768,"model_id":"..."}`).

Slotted next to the vectorize PUT — same auth model, same family.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>