Add low-latency raw search path separate from agentic answer synthesis #163

@ishaanxgupta

Description

Retrieval in retrieval.py currently combines LLM tool selection, storage calls, and LLM synthesis on every request. This favors answer quality but makes each request latency-heavy.

Acceptance criteria:

  • Add /search fast path returning ranked profile/summary/temporal/snippet/code hits without synthesis.
  • Make LLM answer generation optional via answer=true.
  • Cache profile catalogs and retrieval plans.
  • Track p50/p95/p99 latency per retrieval mode.
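
A rough sketch of how these criteria might fit together, not the actual retrieval.py code. Every name here (`_DOCS`, `plan_for`, `raw_search`, `synthesize`, `search`, `latency_percentiles`) is hypothetical, the in-memory list stands in for the real storage layer, term matching stands in for real ranking, and `synthesize` is a placeholder for the LLM call.

```python
import time
from collections import defaultdict
from functools import lru_cache
from statistics import quantiles

# Hypothetical in-memory corpus standing in for the real storage layer.
_DOCS = [
    {"kind": "profile", "text": "Alice prefers Python and type hints"},
    {"kind": "summary", "text": "Discussed retrieval latency budget"},
    {"kind": "temporal", "text": "Last week: moved storage to a new cluster"},
    {"kind": "snippet", "text": "latency spikes when synthesis runs"},
    {"kind": "code", "text": "def search(query): ..."},
]

# Retrieval mode ("raw" or "answer") -> list of request durations in seconds.
_latencies = defaultdict(list)


@lru_cache(maxsize=256)
def plan_for(query):
    """Cached retrieval plan: here just the lowercase query terms, no LLM call."""
    return tuple(query.lower().split())


def raw_search(query, limit=5):
    """Rank documents by how many query terms appear in their text."""
    terms = plan_for(query)
    hits = []
    for doc in _DOCS:
        text = doc["text"].lower()
        score = sum(term in text for term in terms)
        if score:
            hits.append({"kind": doc["kind"], "text": doc["text"], "score": score})
    hits.sort(key=lambda hit: hit["score"], reverse=True)  # stable sort
    return hits[:limit]


def synthesize(query, hits):
    """Placeholder for the LLM synthesis step (assumption: the slow part)."""
    return f"{len(hits)} hit(s) for {query!r}"


def search(query, answer=False):
    """Fast path by default; synthesis only runs when answer=True."""
    mode = "answer" if answer else "raw"
    start = time.perf_counter()
    hits = raw_search(query)
    result = {"mode": mode, "hits": hits}
    if answer:
        result["answer"] = synthesize(query, hits)
    _latencies[mode].append(time.perf_counter() - start)
    return result


def latency_percentiles(mode):
    """Return p50/p95/p99 for one mode, or {} with fewer than two samples."""
    samples = _latencies[mode]
    if len(samples) < 2:
        return {}
    cuts = quantiles(samples, n=100, method="inclusive")  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Calling `search("latency")` would return ranked hits only, while `search("latency", answer=True)` adds an `answer` field. Keeping the latency samples keyed by mode lets the two paths be compared directly; a real service would likely use a bounded histogram rather than an unbounded list.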
