Add raw search fast path with code domain support#174

Open
michael-schvarcz wants to merge 3 commits into XortexAI:main from michael-schvarcz:codex/raw-search-fast-path

Conversation

@michael-schvarcz

Addresses #163.

This adds a low-latency raw search path that returns ranked memory hits without the agentic LLM planning/synthesis round, while keeping answer generation optional through answer=true.

What changed:

  • Adds RetrievalPipeline.raw_search for direct profile, temporal, summary, and snippet retrieval without tool selection.
  • Extends /v1/memory/search to optionally include code hits by querying the existing code retrieval pipeline's native symbol/file search tools.
  • Adds answer=true synthesis after raw hits are collected, without a second planning step.
  • Caches profile catalogs and agentic retrieval plans.
  • Tracks bounded p50/p95/p99 latency snapshots by retrieval mode.
  • Adds tests proving raw search bypasses LLM tool planning, answer mode only synthesizes after hits, and cached plans are reused.
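
To illustrate the bounded p50/p95/p99 latency snapshots described above, a tracker along these lines keeps a fixed number of recent samples per retrieval mode and computes nearest-rank percentiles. This is a minimal sketch with hypothetical names, not the PR's actual implementation:

```python
from collections import defaultdict, deque


class LatencyTracker:
    """Bounded per-mode latency samples with percentile snapshots."""

    def __init__(self, max_samples: int = 1000) -> None:
        # deque(maxlen=...) keeps memory bounded by evicting the oldest samples
        self._samples: dict[str, deque[float]] = defaultdict(
            lambda: deque(maxlen=max_samples)
        )

    def record(self, mode: str, latency_ms: float) -> None:
        self._samples[mode].append(latency_ms)

    def snapshot(self, mode: str) -> dict[str, float]:
        data = sorted(self._samples[mode])
        if not data:
            return {}

        def pct(p: float) -> float:
            # nearest-rank percentile over the sorted sample window
            idx = min(len(data) - 1, int(p / 100 * len(data)))
            return data[idx]

        return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}
```

Because the deque is bounded, the snapshot always reflects only the most recent `max_samples` requests per mode, which is what makes the tracker safe in a long-running API process.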

Compared with the currently open #173, this implementation also wires the requested code domain into the raw search endpoint.

Validation:

.venv/bin/python -m pytest tests/integration/test_retrieval_pipeline.py tests/api/test_dependencies_and_routes.py -q
9 passed

git diff --check
passed


@gemini-code-assist (Bot) left a comment


Code Review

This PR expands the memory search capabilities by adding a "code" domain for repository-level searches and introducing a fast raw_search path that avoids LLM planning. Key architectural additions include a latency tracking system for performance monitoring and caching mechanisms for retrieval plans and profile catalogs. The review feedback highlights several critical improvements: addressing a potential TypeError when rounding null scores, implementing bounded caches to prevent memory leaks, maintaining proper encapsulation of private methods, and enhancing the robustness of concurrent tool execution through better error handling.

Comment thread src/api/routes/memory.py Outdated
SourceRecord(
    domain=s.domain,
    content=s.content,
    score=round(s.score, 3),

Severity: high

The score attribute of a SourceRecord can be None (as handled in the sorting logic on line 728). Attempting to call round(None, 3) will raise a TypeError. It is safer to provide a default value of 0.0 if the score is missing.

Suggested change
score=round(s.score, 3),
score=round(s.score, 3) if s.score is not None else 0.0,

Comment thread src/pipelines/retrieval.py Outdated
Comment on lines +172 to +175
self._profile_catalog_cache: Dict[
    str, Tuple[float, List[Dict[str, str]], List[Any]]
] = {}
self._retrieval_plan_cache: Dict[str, List[Dict[str, Any]]] = {}

Severity: high

The _profile_catalog_cache and _retrieval_plan_cache are implemented as unbounded dictionaries. In a long-running API process, these will grow indefinitely as different users and queries are processed, potentially leading to memory exhaustion. It is recommended to use a cache with a maximum size or an eviction policy (e.g., cachetools.TTLCache or a simple LRU implementation).
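
A simple bounded LRU cache of the kind suggested here can be sketched with `OrderedDict` (a hypothetical stand-in for `cachetools.TTLCache` or whatever LRU-style cache the PR head actually uses):

```python
from collections import OrderedDict
from typing import Any


class LRUCache:
    """Bounded mapping that evicts the least-recently-used entry on overflow."""

    def __init__(self, maxsize: int = 256) -> None:
        self._maxsize = maxsize
        self._data: OrderedDict[str, Any] = OrderedDict()

    def get(self, key: str, default: Any = None) -> Any:
        if key in self._data:
            self._data.move_to_end(key)  # mark as recently used
            return self._data[key]
        return default

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self._maxsize:
            self._data.popitem(last=False)  # evict the oldest entry
```

With a fixed `maxsize`, the per-user plan and catalog caches stay bounded no matter how many distinct queries a long-running API process serves.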

Comment thread src/api/routes/memory.py Outdated
data = SearchResponse(results=all_results, total=len(all_results))
answer = ""
if req.answer:
    answer = await pipeline._answer_from_sources(query=req.query, sources=records)

Severity: medium

This line accesses _answer_from_sources, which is a private method of the RetrievalPipeline class. To maintain proper encapsulation, this method should be made public (e.g., renamed to answer_from_sources) if it is intended to be used outside the class.

Comment thread src/pipelines/retrieval.py Outdated
if include_answer:
    answer = await self._answer_from_sources(query=query, sources=sources)

confidence = min(1.0, len(sources) * 0.15) if sources else 0.1

Severity: medium

The confidence calculation logic here (len(sources) * 0.15) is inconsistent with the logic used in the agentic run method on line 308 (len(sources) * 0.2). Consider unifying this logic or defining the multiplier as a constant to ensure consistent confidence scoring across different retrieval modes.
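
Unifying the multiplier behind a shared constant and helper, as the reviewer suggests, might look like this sketch (constant and function names are assumed, not taken from the PR):

```python
from typing import Any, List

# Hypothetical shared constants so raw and agentic modes score identically.
SOURCE_CONFIDENCE_STEP = 0.15
MIN_CONFIDENCE = 0.1


def confidence_from_sources(sources: List[Any]) -> float:
    """One confidence formula for every retrieval mode."""
    if not sources:
        return MIN_CONFIDENCE
    return min(1.0, len(sources) * SOURCE_CONFIDENCE_STEP)
```

Both the raw path and the agentic `run` method would then call the same helper, so a future tuning of the multiplier changes every mode at once.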

Comment thread src/api/routes/memory.py
Comment on lines +709 to +724
code_results = await asyncio.gather(
    code_pipeline._execute_tool(
        tool_name="search_symbols",
        tool_args={"query": req.query, "repo": req.repo},
        repo=req.repo,
        top_k=req.top_k,
        user_id=user_id,
    ),
    code_pipeline._execute_tool(
        tool_name="search_files",
        tool_args={"query": req.query, "repo": req.repo},
        repo=req.repo,
        top_k=req.top_k,
        user_id=user_id,
    ),
)

Severity: medium

If any of the code retrieval tool calls fail, the entire search request will return a 500 error. For a more robust search experience, consider using asyncio.gather(..., return_exceptions=True) and filtering out failed results, allowing the user to see memory hits even if the code domain search is temporarily unavailable.

@michael-schvarcz force-pushed the codex/raw-search-fast-path branch from 8aa3426 to bb08f07 on May 11, 2026 at 18:43
@michael-schvarcz
Author

Updated this PR to address the review points:

  • Normalizes missing source scores through a shared helper before API serialization.
  • Keeps raw memory search resilient when one requested backend fails by using asyncio.gather(..., return_exceptions=True) and returning healthy domain results.
  • The current head already uses bounded LRU-style caches, public answer_from_sources / confidence_from_sources helpers, and the shared confidence multiplier.

Validation: .venv/bin/python -m pytest tests/integration/test_retrieval_pipeline.py tests/api/test_dependencies_and_routes.py -> 9 passed.

Comment thread src/pipelines/retrieval.py Outdated
Contributor


I rechecked the Gemini items and they look addressed. One remaining edge case I noticed is in src/pipelines/retrieval.py::_format_tool_results: API serialization now normalizes None scores, but answer=true can still fail before serialization because answer_from_sources() formats sources via _format_tool_results(), which does rec.score > 0 / {rec.score:.2f}.

If any backend returns SourceRecord(score=None), raw search without answer works, but raw search with answer=true can raise there. Can we normalize the score inside _format_tool_results() too, e.g. score = rec.score or 0.0, before formatting?
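
The proposed normalization inside the formatting path could look like this sketch (`format_source_line` is a hypothetical stand-in for the relevant part of `_format_tool_results`):

```python
def format_source_line(rec_score, content: str) -> str:
    """Normalize a missing score before the comparison and formatting steps."""
    score = rec_score or 0.0  # SourceRecord(score=None) -> 0.0
    marker = "*" if score > 0 else "-"
    return f"{marker} [{score:.2f}] {content}"
```

Both the `rec.score > 0` comparison and the `{rec.score:.2f}` format would then see a float even when a backend omits the score.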

After that, this looks good to me.

Contributor

@Ankit-Kotnala left a comment


Overall this looks good and the Gemini review items appear addressed.

I left one blocking edge-case comment around _format_tool_results(): answer=true can still fail if any backend returns SourceRecord(score=None), because the answer formatting path uses rec.score > 0 / {rec.score:.2f} before API serialization normalizes scores.

Once that score normalization is handled in _format_tool_results(), I’m happy with this PR.

@michael-schvarcz
Author

Fixed the remaining missing-score edge case in _format_tool_results() by normalizing None to 0.0 before score comparison/formatting, so answer=true no longer crashes before API serialization.

Added regression coverage for answer_from_sources() with SourceRecord(score=None).

Validation:

.venv/bin/python -m pytest tests/integration/test_retrieval_pipeline.py tests/api/test_dependencies_and_routes.py -q
# 10 passed
git diff --check
