LLMQuant · AdairBear · Jun 12, 2026 · Jun 12, 2026 · Jun 24, 2026
diff --git a/.gitignore b/.gitignore
@@ -32,3 +32,4 @@ docs/superpowers/
 .coverage
 htmlcov/
 coverage.xml
+.venv/
diff --git a/QM_MCP_ENGINEERING_LOG.md b/QM_MCP_ENGINEERING_LOG.md
@@ -0,0 +1,23 @@
+# qm_mcp engineering log
+
+Append-only record of notable changes to the `qm_mcp/` research-corpus layer
+(Thomas's additive layer on top of LLMQuant/quant-mind). Upstream `quantmind/`
+history lives in the normal git log.
+
+## 2026-06-12 — Phase 4 landing: qm_mcp merged to master
+
+- **PR [#1](https://github.com/AdairBear/quant-mind/pull/1)** squash-merged →
+  `9b8a9599d5e00f61f9b2c2e883a02ecf1b0aa90c`.
+- Adds the persistence + embedding + semantic-query + MCP layer
+  (`store.py`, `embed.py`, `ingest.py`, `query.py`, `server.py`, `cli.py`,
+  `seed_corpus.txt`, `_smoke_mcp.py`) that QuantMind v0.2 does not yet ship.
+- Companion hermes-agent side: PR
+  [#10](https://github.com/AdairBear/hermes-agent/pull/10) →
+  `84314fa7eec991eccea8a59024c79f3cef53efbc` (the `#research` channel router +
+  `docs/quantmind_brain_boundary.md`).
+- Landed in a **new private** `AdairBear/quant-mind` repo (origin left pointing
+  at upstream `LLMQuant/quant-mind`; `fork` remote added).
+- Verified: direct stdio MCP call enumerates all 7 tools and `qm_query` returns
+  grounded, cited answers; corpus live (33 items incl. Databento
+  futures-microstructure articles). Live-gateway pickup pending an operator
+  restart (see `quantmind_brain_boundary.md` in hermes-agent for the open item).
diff --git a/docs/papers.md b/docs/papers.md
diff --git a/docs/quant-scholar.json b/docs/quant-scholar.json
diff --git a/qm_mcp/README.md b/qm_mcp/README.md
@@ -0,0 +1,86 @@
+# qm_mcp — QuantMind research-corpus surface
+
+This package turns [QuantMind](../README.md) into a **queryable research
+corpus** for Thomas's trading + AVST work, exposed over MCP so Personal
+Hermes, Dispatch sessions, the Conductor, and future Akazi AVST all read the
+same knowledge base.
+
+## Why this exists
+
+QuantMind v0.2 ships **ingestion + LLM extraction only** — `paper_flow`
+fetches an arXiv id / URL / PDF / raw text, converts it to markdown, and
+extracts a typed `Paper` tree. Its persistence, embedding, semantic-query,
+and "Data MCP" layers are still **vision / future PRs** (PR6/PR7 per their
+README). `qm_mcp` supplies exactly that missing Stage-2 layer:
+
+```
+ingest (QuantMind paper_flow)
+   → CorpusStore   (~/.quantmind/corpus : one JSON + one vector per item)
+   → semantic query (OpenAI embeddings → cosine top-k → grounded answer)
+   → MCP server    (qm_ingest_*, qm_query, qm_list_corpus, qm_delete_item)
+```
+
+It is dependency-light: it reuses QuantMind's own venv (`openai`, `numpy`,
+`pydantic`, `httpx`, `mcp`) and stores everything on the local filesystem.
+
+## Secrets
+
+Loaded from `~/.hermes/.env` at runtime — nothing is hard-coded. Embeddings
+and `paper_flow` extraction need a **real platform.openai.com** key. Hermes'
+`OPENAI_API_KEY` is an OpenRouter key (`sk-or-…`, no embeddings endpoint), so
+`qm_mcp` uses `VOICE_TOOLS_OPENAI_KEY` (the real OpenAI key kept for Whisper)
+and forces it for this process only.
+
+## Run the MCP server
+
+```bash
+/Users/thomasadair/projects/quant-mind/.venv/bin/python -m qm_mcp.server
+```
+
+Registered in Hermes `~/.hermes/config.yaml` under `mcp_servers: quantmind`
+(see `docs/quantmind_brain_boundary.md` in the hermes-agent repo).
+
+## CLI (seeding + shell use)
+
+```bash
+PY=/Users/thomasadair/projects/quant-mind/.venv/bin/python
+$PY -m qm_mcp.cli ingest-arxiv 1105.3115
+$PY -m qm_mcp.cli ingest-pdf  ~/papers/foo.pdf
+$PY -m qm_mcp.cli ingest-url  https://example.com/article
+$PY -m qm_mcp.cli seed        qm_mcp/seed_corpus.txt
+$PY -m qm_mcp.cli query       "What does Stoikov say about gamma?"
+$PY -m qm_mcp.cli list
+$PY -m qm_mcp.cli delete      <item_id>
+```
+
+## MCP tools
+
+| Tool | Purpose |
+|---|---|
+| `qm_ingest_arxiv(arxiv_id)` | Ingest an arXiv paper by id or URL |
+| `qm_ingest_url(url)` | Ingest a web page / hosted PDF |
+| `qm_ingest_pdf(path)` | Ingest a local PDF / HTML / Markdown file |
+| `qm_ingest_text(text, title?)` | Ingest pasted text |
+| `qm_query(question, k=5)` | Grounded natural-language answer + top-k sources |
+| `qm_list_corpus()` | List all ingested items (metadata) |
+| `qm_delete_item(item_id)` | Remove one item |
+
+## Storage
+
+`~/.quantmind/corpus/` (outside both git repos — never committed):
+- `items/<id>.json` — record: metadata + flattened context + full Paper tree
+- `vectors/<id>.npy` — 1536-dim embedding (aligned by id)
+- `ingestion_log.jsonl` — append-only ledger of ingestion events
+
+`id` is a stable hash of the source, so re-ingesting is idempotent (dedup).
+
+## Known QuantMind quirks handled here
+
+- **Strict-schema rejection.** `Agent(output_type=Paper)` fails under OpenAI
+  strict structured output (recursive UUID-keyed tree). We pass a non-strict
+  `AgentOutputSchema(Paper, strict_json_schema=False)`.
+- **No news flow.** QuantMind has `knowledge/news.py` types but no
+  `news_flow`. News/blog URLs go through the generic `HttpUrl` → `paper_flow`
+  path (trafilatura HTML → markdown → extraction).
+- **DOI unsupported.** `paper_flow` raises `NotImplementedError` on DOI
+  inputs upstream; use arXiv id or a direct URL.
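Review note on the README's Storage section: it says `id` is a stable hash of the source, which is what makes re-ingestion idempotent. `store.py` is not shown in this diff, so the following is only a minimal sketch of that idea under stated assumptions. The SHA-256 scheme, the 16-character truncation, the `item_id` and `put_item` names, and the `"stored"` / `"exists"` statuses are all hypothetical, not the package's actual API.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def item_id(source: str) -> str:
    """Hypothetical id scheme: truncated SHA-256 of the whitespace-stripped source."""
    normalized = source.strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def put_item(root: Path, source: str, record: dict, *, force: bool = False) -> str:
    """Write items/<id>.json once; a repeat call is a no-op unless force is set."""
    items = root / "items"
    items.mkdir(parents=True, exist_ok=True)
    path = items / f"{item_id(source)}.json"
    if path.exists() and not force:
        return "exists"  # same source -> same id -> nothing rewritten
    path.write_text(json.dumps({"source": source, **record}), encoding="utf-8")
    return "stored"


# Ingest the same source twice into a throwaway corpus directory.
root = Path(tempfile.mkdtemp())
first = put_item(root, "arxiv:1105.3115", {"title": "example"})
second = put_item(root, "arxiv:1105.3115", {"title": "example"})
item_count = len(list((root / "items").glob("*.json")))
```

Because the file name is derived only from the source string, the second call finds the existing record and leaves the corpus at one item; a `force` flag, like the CLI's `--force`, would be the escape hatch for deliberate re-ingestion.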
diff --git a/qm_mcp/__init__.py b/qm_mcp/__init__.py
@@ -0,0 +1,18 @@
+"""qm_mcp — the research-corpus surface built on top of QuantMind ingestion.
+
+QuantMind v0.2 ships ingestion + LLM extraction only (``paper_flow``); the
+persistence, embedding, semantic-query, and MCP layers (its "Stage 2 /
+Data MCP" vision) are not yet built upstream. This package supplies exactly
+that missing layer so QuantMind becomes a usable, queryable corpus for
+Thomas's trading + AVST research:
+
+    ingest (paper_flow)  ->  CorpusStore (JSON + vectors)  ->  semantic query
+                                      \\-> MCP server (Hermes / Dispatch / Conductor)
+
+It is intentionally self-contained and dependency-light: it reuses
+QuantMind's own venv (openai, numpy, pydantic, httpx, mcp) and stores the
+corpus on the local filesystem under ``QM_CORPUS_DIR``.
+"""
+
+__all__ = ["__version__"]
+__version__ = "0.1.0"
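Review note on the "semantic query" stage: the README describes it as "OpenAI embeddings → cosine top-k", but `query.py` is not part of this diff. The sketch below shows cosine top-k ranking in isolation, assuming each item has one vector keyed by its id. The `cosine` and `top_k` names and the three-dimensional toy vectors are illustrative assumptions; the real corpus stores 1536-dimensional vectors as `.npy` files, and NumPy would be the natural tool there. The standard library is used here only to keep the sketch dependency-free.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: list[float], vectors: dict[str, list[float]], k: int = 5
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy corpus: ids mapped to (hypothetical) embedding vectors.
vectors = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [0.7, 0.7, 0.0],
}
hits = top_k([1.0, 0.1, 0.0], vectors, k=2)
```

A brute-force scan like this is linear in corpus size, which is reasonable for a corpus of a few dozen items; the retrieved ids would then be used to load the matching item records as context for the grounded answer.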
diff --git a/qm_mcp/_smoke_mcp.py b/qm_mcp/_smoke_mcp.py
@@ -0,0 +1,33 @@
+"""Standalone MCP stdio smoke test: spawn the server, list tools, list corpus.
+
+Run under the QuantMind venv:
+    python -m qm_mcp._smoke_mcp
+"""
+
+from __future__ import annotations
+
+import asyncio
+import os
+import sys
+
+from mcp import ClientSession, StdioServerParameters
+from mcp.client.stdio import stdio_client
+
+
+async def main() -> None:
+    params = StdioServerParameters(
+        command=sys.executable,
+        args=["-m", "qm_mcp.server"],
+        env={**os.environ, "PYTHONPATH": os.getcwd()},
+    )
+    async with stdio_client(params) as (read, write):
+        async with ClientSession(read, write) as session:
+            await session.initialize()
+            tools = await session.list_tools()
+            print("TOOLS:", [t.name for t in tools.tools])
+            res = await session.call_tool("qm_list_corpus", {})
+            print("LIST_CORPUS:", res.content[0].text[:400])
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/qm_mcp/cli.py b/qm_mcp/cli.py
@@ -0,0 +1,126 @@
+"""Command-line surface for the QuantMind corpus.
+
+Used for seeding the initial corpus, manual queries, and as a shell-callable
+backend for any tool that prefers a subprocess over MCP. Examples::
+
+    python -m qm_mcp.cli ingest-arxiv 1105.3115
+    python -m qm_mcp.cli ingest-pdf ~/papers/foo.pdf
+    python -m qm_mcp.cli seed papers.txt          # one source per line
+    python -m qm_mcp.cli query "What is gamma in Avellaneda-Stoikov?"
+    python -m qm_mcp.cli list
+    python -m qm_mcp.cli delete <id>
+"""
+
+from __future__ import annotations
+
+import argparse
+import asyncio
+import json
+import sys
+from pathlib import Path
+
+from qm_mcp import ingest as I
+from qm_mcp.query import query as run_query
+from qm_mcp.store import CorpusStore
+
+
+def _print(obj) -> None:
+    print(json.dumps(obj, indent=2, default=str))
+
+
+async def _dispatch_source(src: str, *, force: bool):
+    """Route one seed line to the right ingest fn by simple heuristics."""
+    s = src.strip()
+    if not s or s.startswith("#"):
+        return None
+    # Strip inline "# comment" trailers (seed files annotate ids). URLs may
+    # legitimately contain '#', so only strip for non-URL lines.
+    if not s.lower().startswith("http"):
+        s = s.split("#", 1)[0].strip()
+    if not s:
+        return None
+    low = s.lower()
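+    is_arxiv_prefixed = low.startswith("arxiv:")
+    if is_arxiv_prefixed or (low.startswith("http") and "arxiv.org" in low):
+        # Strip the prefix case-insensitively ("arXiv:1105.3115" is the
+        # conventional spelling, which a case-sensitive split would miss).
+        # arXiv URLs pass through unchanged.
+        if is_arxiv_prefixed:
+            s = s[len("arxiv:") :]
+        return await I.ingest_arxiv(s.strip(), force=force)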
+    if low.startswith("http://") or low.startswith("https://"):
+        return await I.ingest_url(s, force=force)
+    if Path(s).expanduser().is_file():
+        return await I.ingest_pdf(s, force=force)
+    # bare token -> treat as arxiv id
+    return await I.ingest_arxiv(s, force=force)
+
+
+async def _amain(args: argparse.Namespace) -> int:
+    if args.cmd == "ingest-arxiv":
+        _print(await I.ingest_arxiv(args.value, force=args.force))
+    elif args.cmd == "ingest-url":
+        _print(await I.ingest_url(args.value, force=args.force))
+    elif args.cmd == "ingest-pdf":
+        _print(await I.ingest_pdf(args.value, force=args.force))
+    elif args.cmd == "ingest-text":
+        _print(await I.ingest_text(args.value, force=args.force))
+    elif args.cmd == "seed":
+        lines = Path(args.value).read_text(encoding="utf-8").splitlines()
+        results = []
+        for line in lines:
+            try:
+                res = await _dispatch_source(line, force=args.force)
+            except Exception as exc:  # one bad source must not sink the batch
+                res = {
+                    "source": line.strip(),
+                    "status": "error",
+                    "error": str(exc),
+                }
+            if res is not None:
+                results.append(res)
+                print(
+                    f"  [{str(res.get('status')):>8}] {res.get('title') or res.get('source') or res.get('error')}",
+                    file=sys.stderr,
+                )
+        _print({"seeded": results, "total": len(results)})
+    elif args.cmd == "query":
+        _print(await run_query(args.value, k=args.k))
+    elif args.cmd == "list":
+        store = CorpusStore()
+        _print({"count": len(store), "items": store.list_records(light=True)})
+    elif args.cmd == "delete":
+        _print({"id": args.value, "deleted": CorpusStore().delete(args.value)})
+    else:  # pragma: no cover
+        return 2
+    return 0
+
+
+def main() -> int:
+    p = argparse.ArgumentParser(
+        prog="qm_mcp.cli", description="QuantMind corpus CLI"
+    )
+    p.add_argument(
+        "--force", action="store_true", help="re-ingest even if present"
+    )
+    sub = p.add_subparsers(dest="cmd", required=True)
+    for name in (
+        "ingest-arxiv",
+        "ingest-url",
+        "ingest-pdf",
+        "ingest-text",
+        "seed",
+        "delete",
+    ):
+        sp = sub.add_parser(name)
+        sp.add_argument("value")
+    qp = sub.add_parser("query")
+    qp.add_argument("value")
+    qp.add_argument("-k", type=int, default=5)
+    sub.add_parser("list")
+    args = p.parse_args()
+    return asyncio.run(_amain(args))
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
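Review note on `_dispatch_source`: the routing heuristics are easier to check when separated from the async ingest calls. The sketch below mirrors the order of the checks in this file as a pure function that returns only a route label. The `classify_seed_line` name, the label strings, and the injectable `is_file` predicate (standing in for `Path(s).expanduser().is_file()`) are assumptions made for illustration, not part of the package.

```python
from typing import Callable, Optional


def classify_seed_line(
    line: str, is_file: Callable[[str], bool] = lambda p: False
) -> Optional[str]:
    """Return "arxiv", "url", "file", or None for blank and comment lines."""
    s = line.strip()
    if not s or s.startswith("#"):
        return None
    # URLs may contain '#', so inline comments are stripped only for non-URLs.
    if not s.lower().startswith("http"):
        s = s.split("#", 1)[0].strip()
    if not s:
        return None
    low = s.lower()
    if low.startswith("arxiv:") or (low.startswith("http") and "arxiv.org" in low):
        return "arxiv"
    if low.startswith(("http://", "https://")):
        return "url"
    if is_file(s):
        return "file"
    # Bare token: treated as an arXiv id, as in the CLI.
    return "arxiv"


routes = [
    classify_seed_line("# seed list"),
    classify_seed_line("1105.3115  # Avellaneda-Stoikov"),
    classify_seed_line("https://arxiv.org/abs/1105.3115"),
    classify_seed_line("https://example.com/article#section"),
    classify_seed_line("~/papers/foo.pdf", is_file=lambda p: True),
    classify_seed_line("~/papers/missing.pdf"),
]
```

The last case is worth noting: a path that does not exist falls through to the bare-token branch and is treated as an arXiv id, so a mistyped file path in a seed file will surface as a failed arXiv ingest rather than a file-not-found error.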
diff --git a/qm_mcp/config.py b/qm_mcp/config.py
@@ -0,0 +1,97 @@
+"""Configuration + secret loading for the QuantMind corpus surface.
+
+Secrets are NOT hard-coded here. The OpenAI key (used by QuantMind's
+``paper_flow`` extraction and by our embedding/synthesis calls) is loaded
+from the canonical Hermes gateway env file ``~/.hermes/.env`` if present,
+then from the process environment. This mirrors the Phase 3 Doppler/`.env`
+pattern: the running gateway already owns these secrets.
+"""
+
+from __future__ import annotations
+
+import os
+from pathlib import Path
+
+# Canonical secret source: the always-on Hermes gateway env file.
+_HERMES_ENV = Path.home() / ".hermes" / ".env"
+
+# Embedding + synthesis models. text-embedding-3-small is 1536-dim, cheap,
+# and good enough for a coarse semantic pre-filter over a research corpus.
+EMBED_MODEL = os.environ.get("QM_EMBED_MODEL", "text-embedding-3-small")
+SYNTH_MODEL = os.environ.get("QM_SYNTH_MODEL", "gpt-4o-mini")
+# Extraction model for paper_flow. gpt-4o-mini keeps per-paper cost to cents.
+EXTRACT_MODEL = os.environ.get("QM_EXTRACT_MODEL", "gpt-4o-mini")
+
+# Embedding input ceiling (chars). text-embedding-3-small caps at ~8191
+# tokens; ~24k chars (~6k tokens) leaves comfortable headroom.
+EMBED_CHAR_LIMIT = 24_000
+# Synthesis context ceiling (chars) across all retrieved sources.
+SYNTH_CONTEXT_CHAR_LIMIT = 14_000
+
+
+def corpus_dir() -> Path:
+    """Root directory for the persisted corpus (items + vectors)."""
+    raw = os.environ.get("QM_CORPUS_DIR")
+    base = (
+        Path(raw).expanduser()
+        if raw
+        else (Path.home() / ".quantmind" / "corpus")
+    )
+    base.mkdir(parents=True, exist_ok=True)
+    return base
+
+
+def load_secrets() -> None:
+    """Load OPENAI_API_KEY (and friends) from ~/.hermes/.env into os.environ.
+
+    Process-env values win over the file, except that VOICE_TOOLS_OPENAI_KEY,
+    when set, replaces OPENAI_API_KEY. Secrets are never printed or returned.
+    """
+    # A missing or unreadable file is not fatal: fall through to whatever is
+    # already in the environment so the key override below still runs. The
+    # OpenAI client raises a clear error if the key is genuinely absent.
+    try:
+        text = _HERMES_ENV.read_text(encoding="utf-8")
+    except OSError:
+        text = ""
+
+    for line in text.splitlines():
+        line = line.strip()
+        if not line or line.startswith("#") or "=" not in line:
+            continue
+        key, _, val = line.partition("=")
+        key = key.strip()
+        val = val.strip().strip('"').strip("'")
+        if key and key not in os.environ:
+            os.environ[key] = val
+
+    # CRITICAL: Hermes' OPENAI_API_KEY is an OpenRouter key (sk-or-...). That
+    # 401s against api.openai.com and OpenRouter exposes no embeddings
+    # endpoint. The real platform.openai.com key is stored separately as
+    # VOICE_TOOLS_OPENAI_KEY (used for Whisper). Force it as the OpenAI key
+    # for THIS process only so both QuantMind's openai-agents extraction and
+    # our embeddings/synthesis hit real OpenAI. We also clear any OpenAI base
+    # URL so the client cannot be redirected to OpenRouter.
+    real = os.environ.get("VOICE_TOOLS_OPENAI_KEY", "").strip()
+    if real:
+        os.environ["OPENAI_API_KEY"] = real
+        os.environ.pop("OPENAI_BASE_URL", None)
+
+
+def require_openai_key() -> str:
+    """Return the real OpenAI key or raise a clear, actionable error."""
+    load_secrets()
+    key = os.environ.get("OPENAI_API_KEY", "").strip()
+    if not key:
+        raise RuntimeError(
+            "No OpenAI key available. QuantMind ingestion + corpus embedding "
+            "need a real platform.openai.com key. Set VOICE_TOOLS_OPENAI_KEY "
+            "(preferred) or OPENAI_API_KEY in ~/.hermes/.env."
+        )
+    if key.startswith("sk-or-"):
+        raise RuntimeError(
+            "The active OpenAI key is an OpenRouter key (sk-or-...), which "
+            "cannot do embeddings or reach api.openai.com. Set "
+            "VOICE_TOOLS_OPENAI_KEY to a real platform.openai.com key."
+        )
+    return key
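Review note on key precedence in `config.py`: `load_secrets` and `require_openai_key` together apply three rules (process env beats the file, `VOICE_TOOLS_OPENAI_KEY` replaces `OPENAI_API_KEY`, and an `sk-or-` key is rejected). The sketch below restates those rules as a pure function over two mappings so they can be tested without touching `os.environ` or the real secrets file. The `resolve_openai_key` name, the shortened error messages, and the placeholder key strings are assumptions for illustration only.

```python
from typing import Mapping


def resolve_openai_key(file_vars: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Pick the OpenAI key the way config.py describes, without side effects."""
    # Process-env values win over values parsed from the env file.
    merged = {**file_vars, **env}
    # A real platform key stored as VOICE_TOOLS_OPENAI_KEY takes priority.
    real = merged.get("VOICE_TOOLS_OPENAI_KEY", "").strip()
    key = real or merged.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("no OpenAI key available")
    if key.startswith("sk-or-"):
        raise RuntimeError("OpenRouter key cannot serve embeddings")
    return key


# Placeholder values only; these are not real keys.
file_vars = {
    "OPENAI_API_KEY": "sk-or-placeholder",
    "VOICE_TOOLS_OPENAI_KEY": "sk-placeholder-real",
}
chosen = resolve_openai_key(file_vars, env={})
```

Keeping the decision in a side-effect-free function like this would make the precedence rules unit-testable; the module as written mutates `os.environ` instead, which is what lets QuantMind's own extraction code pick up the same key.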