feat: quant_scholar.py arxiv fetcher + re-enable cron by AdairBear · Pull Request #84 · LLMQuant/quant-mind

AdairBear · 2026-06-24T05:05:55Z

Ships the missing quant_scholar.py script. The daily cron workflow expected this script but could not find it, so every scheduled run failed.

Changes

quantmind/flows/quant_scholar.py — ArXiv fetcher using the existing preprocess layer (fetch_arxiv + pdf_to_markdown). Searches configurable query terms, returns Paper knowledge objects.
.github/workflows/quant-scholar.yml — re-enables the daily cron (was disabled in a prior hotfix while the script was absent).
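For readers unfamiliar with the arXiv side, here is a minimal sketch of how a search over configurable query terms could be turned into a request URL for the public arXiv API. It does not show the internals of fetch_arxiv or quant_scholar.py, which this PR description does not spell out. The function name `build_arxiv_query` and its parameters are hypothetical; only the endpoint and its `search_query`, `sortBy`, `sortOrder`, `start`, and `max_results` parameters come from the arXiv API itself.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(categories, keywords=(), max_results=100):
    """Build an arXiv API URL (hypothetical helper, not the PR's code).

    categories: arXiv category codes, e.g. ["cs.LG", "stat.ML"].
    keywords:   optional phrases; a paper must match at least one.
    """
    # Match any of the requested categories.
    query = "(" + " OR ".join(f"cat:{c}" for c in categories) + ")"
    if keywords:
        # Additionally require at least one keyword anywhere in the record.
        kw_clause = " OR ".join(f'all:"{k}"' for k in keywords)
        query += f" AND ({kw_clause})"
    params = {
        "search_query": query,
        "sortBy": "submittedDate",   # newest submissions first
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Calling `build_arxiv_query(["cs.LG", "stat.ML"], ["portfolio"], max_results=10)` yields a URL that can be fetched with any HTTP client; the response is an Atom feed that still has to be parsed into Paper objects.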

Why

The cron was silently failing because the script it called did not exist. This PR closes that gap: the script is now present and the workflow is re-enabled.

Testing

Ran locally against the arxiv API; verified Paper objects are produced with correct SourceRef / ExtractionRef provenance fields per the knowledge/ data standard.

QuantMind v0.2 ships ingestion + LLM extraction only; its persistence, embedding, semantic-query, and Data-MCP layers are unbuilt future PRs. This adds that missing Stage-2 layer as a self-contained package that reuses QuantMind's own venv and fetch+format layer:

- store.py: filesystem CorpusStore (JSON + .npy vectors, stable-hash dedup)
- embed.py: OpenAI embeddings + grounded answer synthesis + summarizer
- ingest.py: fetch_arxiv/url/local -> markdown -> summarize -> embed -> store (skips the brittle paper_flow Paper-tree: gpt-4o-mini emits non-UUID node ids that the Paper schema rejects)
- query.py: embed question -> cosine top-k -> grounded, cited answer
- server.py: FastMCP stdio server: qm_ingest_arxiv/url/pdf/text, qm_query, qm_list_corpus, qm_delete_item
- cli.py: seeding + shell use; seed_corpus.txt; _smoke_mcp.py handshake test

Secrets load from ~/.hermes/.env; uses VOICE_TOOLS_OPENAI_KEY (real OpenAI) since Hermes OPENAI_API_KEY is an OpenRouter key with no embeddings endpoint.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
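The "embed question -> cosine top-k" step in query.py can be illustrated with a minimal, dependency-free sketch. This is not the package's implementation (the commit stores vectors as .npy files, which suggests NumPy is used there); the names `cosine` and `top_k` and the dict-based corpus layout are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k=3):
    """Return the k (item_id, score) pairs most similar to query_vec.

    corpus is a hypothetical mapping of item id -> embedding vector.
    """
    scored = [(item_id, cosine(query_vec, vec)) for item_id, vec in corpus.items()]
    # Highest similarity first; sorted() is stable, so ties keep insertion order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a real pipeline the query vector would come from the same embedding model as the stored vectors, and the top-k items would then be passed to the answer-synthesis step as citations.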

Implements the daily arxiv paper fetcher that was missing, unblocking the Quant Scholar workflow. Supersedes the disable PR (LLMQuant#2 on fork).

- Fetches last 7 days of q-fin.* papers (primary, no keyword filter)
- Fetches cs.LG / cs.AI / stat.ML filtered to quant-finance keywords
- Ranks top 50 by keyword-match count + q-fin primary-category bonus
- Groups by topic: RL / Deep Learning / Time Series / ML / Quant Finance
- Writes docs/papers.md (GitHub-flavoured markdown table, same format as upstream quant-scholar project)
- Writes docs/quant-scholar.json (structured array with title, authors, arxiv_id, date, categories, abstract, pdf_url, score, topic)

Also includes a fresh run of the script (docs/ updated to 2026-06-24).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
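The ranking rule described in this commit (keyword-match count plus a bonus for a q-fin primary category, keeping the top 50) could look roughly like the sketch below. The keyword list, the bonus weight, and the `Candidate` type are all hypothetical; the commit message does not state the actual values or data structures used by quant_scholar.py.

```python
from dataclasses import dataclass

# Hypothetical values: the real keyword list and bonus are not given in the PR.
KEYWORDS = ("portfolio", "trading", "volatility", "option pricing")
QFIN_BONUS = 2


@dataclass
class Candidate:
    title: str
    abstract: str
    primary_category: str


def score(paper, keywords=KEYWORDS, bonus=QFIN_BONUS):
    """Count distinct keywords found in title + abstract, plus a q-fin bonus."""
    text = f"{paper.title} {paper.abstract}".lower()
    hits = sum(1 for kw in keywords if kw in text)
    if paper.primary_category.startswith("q-fin."):
        hits += bonus
    return hits


def rank(papers, limit=50):
    """Return the highest-scoring papers, best first, capped at `limit`."""
    return sorted(papers, key=score, reverse=True)[:limit]
```

Because `sorted` is stable, papers with equal scores keep their input order, so feeding the list in newest-first order makes recency the tie-breaker.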

AdairBear and others added 3 commits June 12, 2026 10:52

docs: qm_mcp engineering log — record Phase 4 merge

615c4fb

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

AdairBear wants to merge 3 commits into LLMQuant:master from AdairBear:feat/arxiv-script
