sid732 · Jun 23, 2026
diff --git a/.claude/skills/local-context-router/SKILL.md b/.claude/skills/local-context-router/SKILL.md
@@ -2,55 +2,57 @@
 name: local-context-router
 description: >-
   Preflight a PDF, scan, or screenshot locally before sending it to the model.
-  Extracts the embedded text layer for free, OCRs image-only pages on-device
-  with Apple Vision, and flags only genuinely visual pages (tables, charts,
-  diagrams) for the vision model — cutting vision-token cost. Use whenever the
-  user shares a PDF or image to read, summarize, or extract from.
+  Extracts the embedded text layer, OCRs image-only pages on-device with Apple
+  Vision, and flags genuinely visual pages (tables, charts, diagrams) for the
+  vision model, which cuts vision-token cost. Use whenever the user shares a PDF
+  or image to read, summarize, or extract from.
 ---
 
 # Local Context Router
 
-Multimodal models read a PDF by extracting its text *and* rendering every page
-to an image, billing for both. For text-heavy pages that is a 2–10× token tax
-for no added signal. This skill spends cheap local compute first and only pays
-for vision when a page's meaning actually lives in its pixels.
+A multimodal model reads a PDF by extracting its text and rendering every page to
+an image, then paying for both. On a page that is mostly prose, the image is
+wasted spend. Run this preflight first and send the model only what each page
+needs.
 
 ## When to use
 
-Use this **before** attaching a PDF, scan, or screenshot to the conversation —
-whenever the user wants you to read, summarize, or extract from a document.
+Before reading, summarizing, or extracting from a PDF, scan, or screenshot the
+user has shared.
 
-## How to run
+## Requirements
 
-Run the preflight script on the file. It picks the cheapest faithful source per
-page and prints the result as JSON:
+The `localcontextrouter` package must be installed (`pip install localcontextrouter`;
+macOS only). It provides the `localctx` command used below.
+
+## Run
+
+Route the document and read the JSON, rendering any visual pages into a folder:
 
 ```sh
-python "${CLAUDE_SKILL_DIR}/scripts/preflight.py" <path-to-document> --json --vision-dir "${CLAUDE_SKILL_DIR}/.cache"
+localctx <path-to-document> --json --vision-dir ./lcr-pages
 ```
 
-- `<path-to-document>` is the PDF or image to analyze.
-- `--vision-dir` is where rendered images of visual pages are written.
+If `localctx` is not on the PATH, run the bundled script by its path inside this
+skill folder instead:
+
+```sh
+python scripts/preflight.py <path-to-document> --json --vision-dir ./lcr-pages
+```
 
-## How to use the result
+## Use the result
 
-The JSON has a `pages` array and a `tokens_saved` total. For each page:
+The JSON has `tokens_saved` and a `pages` array. Each page carries `source`,
+`text`, `text_tokens`, `image_tokens`, and `image`:
 
-- **`source: "text"`** — use the page's `text` directly. Do **not** attach the
-  image; it adds cost without information.
-- **`source: "ocr"`** — the page was image-only and has been OCR'd on-device;
-  use the returned `text`.
-- **`source: "vision"`** — the page is a table, chart, or diagram whose meaning
-  is visual. Attach the rendered image at `image` to the conversation so the
-  vision model can read it. The `text` is a rough fallback only.
+- `source: "text"`: use `text` directly; do not attach the image.
+- `source: "ocr"`: the page was image-only and has been OCR'd on-device; use `text`.
+- `source: "vision"`: the page is a table, chart, or diagram; attach the image at
+  `image` so the model can read it. The `text` is a rough fallback only.
 
-Assemble the per-page text in order for the parts you can read as text, and
-attach images only for the `vision` pages. Mention `tokens_saved` if the user
-cares about cost.
+Assemble the text and OCR pages in reading order, attach images only for the
+vision pages, and mention `tokens_saved` if the user cares about cost.
 
 ## Notes
 
-- Everything runs locally and offline; no document leaves the machine during
-  preflight.
-- Requires macOS (on-device OCR uses Apple Vision) and the `localcontextrouter`
-  package importable by the Python interpreter.
+Everything runs locally and offline; the document does not leave the machine.
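
The "Use the result" rules in the new SKILL.md can be sketched in a few lines of Python. This is an illustration only, not code from the repository: `plan_context` and the sample report are hypothetical, the report shape is assumed from the field names listed above (`pages`, `source`, `text`, `image`), and the token fields are left out for brevity.

```python
import json


def plan_context(report: dict) -> tuple[str, list[str]]:
    """Return (text to send, image paths to attach) for a routing report."""
    texts: list[str] = []
    images: list[str] = []
    for page in report["pages"]:
        # Attach an image only when the page is visual and a render exists.
        if page["source"] == "vision" and page.get("image"):
            images.append(page["image"])
        else:
            # "text" and "ocr" pages, plus vision pages that have no rendered
            # image, where the rough text is the only thing available.
            texts.append(page["text"])
    return "\n\n".join(texts), images


# Made-up report; the real output may carry more fields or different values.
SAMPLE = json.loads("""
{
  "tokens_saved": 3085,
  "pages": [
    {"source": "text", "text": "ACME Corp, Invoice #4471", "image": null},
    {"source": "vision", "text": "Quarterly results by segment",
     "image": "lcr-pages/page-2.png"},
    {"source": "ocr", "text": "SCANNED RECEIPT TOTAL 42.00", "image": null}
  ]
}
""")

text, images = plan_context(SAMPLE)
print(text)
print(images)
```

Falling back to `text` for a vision page without an image is a choice made for this sketch; the skill itself only calls that text a rough fallback.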
diff --git a/.gitignore b/.gitignore
@@ -38,5 +38,5 @@ src/localcontextrouter/_bin/
 /tmp/
 *.log
 
-# Claude Code local (user-specific) settings
+# Local agent settings (user-specific)
 .claude/settings.local.json
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -21,15 +21,17 @@ First release.
   estimate, following each provider's documented tokenization.
 - `route_pdf`, which routes each page to text, OCR, or vision and reports the
   tokens saved versus sending every page as an image.
-- Routed text is normalized — stray control characters (e.g. PDF discretionary
-  hyphens) are stripped and line endings collapsed — while classification still
+- Routed text is normalized: stray control characters (such as PDF discretionary
+  hyphens) are stripped and line endings collapsed, while classification still
   runs on the raw text layer.
 - `localctx` command-line interface.
 - A `local-context-router` Agent Skill for Claude Code and Codex.
 
 ### Notes
 
-- macOS only; OCR uses the Apple Vision framework.
+- macOS only; OCR uses the Apple Vision framework and needs a normal macOS
+  graphics environment, so it will not run inside a headless sandbox that lacks
+  one.
 - The macOS wheel is a `universal2` platform wheel that bundles the `lcr-ocr`
   binary, so OCR works out of the box. `LCR_OCR_BIN` overrides the bundled copy.
 

diff --git a/README.md b/README.md
@@ -1,77 +1,111 @@
 # LocalContextRouter
 
-> Stop paying the vision-token tax. Decide locally — text, OCR, or vision — *before* a document ever reaches a multimodal LLM.
+Decide locally how each page of a document should reach a multimodal model:
+as extracted text, on-device OCR, or a rendered image. That keeps you from
+paying for vision tokens on pages that are only text.
 
-LocalContextRouter is a **preflight layer** for document-heavy LLM workflows. Given a
-PDF or image, it inspects the content on your machine and decides the cheapest path
-that still preserves accuracy:
+A multimodal model reads a PDF by pulling its text *and* rendering every page to
+an image, then billing for both. On a text page that image runs roughly
+1,300 to 4,800 tokens while the same page as plain text is 400 to 800. For a
+text-dominant document that is several times the cost for nothing extra.
+LocalContextRouter does the cheap work on your machine first and tells you what
+each page actually needs.
 
-- **Text-layer PDF** → extract text locally (near-free).
-- **Scanned / image-only page** → OCR on-device with Apple Vision.
-- **Chart / table / diagram / layout-heavy page** → keep the page as an image for the vision model, where the pixels actually carry meaning.
+It does not call a model. It returns a per-page decision and the text to send;
+your application still makes the call.
 
-It never calls an LLM itself. It prepares the cheapest faithful context and hands you
-back a routing decision plus a token-savings estimate. Your application still owns the
-model call.
+## How it decides
 
-## Why
+For each page:
 
-Multimodal models read a PDF by extracting its text *and* rendering every page to an
-image, then billing for both. A text-heavy page sent as an image can cost
-**1,300–4,800 tokens**; the same page as extracted text costs **400–800**. For
-text-dominant documents that is a 2–10× tax for zero added signal.
+- A usable text layer that is mostly prose: use the extracted text.
+- A text layer dominated by a table, chart, or diagram: send the page as an
+  image, where the layout carries the meaning.
+- No usable text, such as a scan or a photo: recognize it on-device with
+  Apple's Vision framework.
 
-LocalContextRouter spends cheap local compute to avoid that tax — and only escalates
-to vision when the page genuinely needs it.
+The result also reports how many tokens you saved against sending every page as
+an image.
 
 ## Install
 
 ```sh
 pip install localcontextrouter
 ```
 
-The macOS wheel bundles the on-device OCR binary (`lcr-ocr`, a universal2 build),
-so OCR works out of the box — no extra setup. To override it (e.g. a locally built
-binary), set `LCR_OCR_BIN` to its path.
+macOS only. The wheel bundles a universal (Apple Silicon and Intel) OCR binary,
+so text recognition works with no extra setup.
 
-## Use
+## Command line
 
-There is no server and no background process — everything runs on demand and exits.
+```sh
+localctx invoice.pdf
+localctx invoice.pdf --json
+localctx scan.png
+```
 
-### Command line
+`localctx invoice.pdf` prints each page, the source chosen for it, and the
+tokens saved:
 
-```sh
-localctx report.pdf                       # human summary + tokens saved
-localctx report.pdf --json                # machine-readable
-localctx report.pdf --vision-dir ./out    # render visual pages to ./out
+```
+Document: invoice.pdf (3 pages)
+Tokens saved vs sending every page as an image: 3085
+
+Page 1 [text]
+ACME Corp, Invoice #4471 ...
+
+Page 2 [vision]
+Quarterly results by segment ...
+
+Page 3 [ocr]
+SCANNED RECEIPT TOTAL 42.00
 ```
 
-### Library
+Add `--vision-dir DIR` to render the pages that should go to the model as images
+into `DIR`; their paths are then listed in the output and the JSON.
+
+## In code
 
 ```python
 from localcontextrouter import route_pdf, Source
 
-result = route_pdf("report.pdf")
+result = route_pdf("invoice.pdf")
 for page in result.pages:
     if page.source is Source.VISION:
-        ...  # send the rendered page image to the model
+        send_image(page.index)     # the page's meaning is visual
     else:
-        ...  # use page.text (extracted or OCR'd)
+        send_text(page.text)       # extracted or recognized text
 
-print(result.text)          # all text-routable pages joined
-print(result.tokens_saved)  # tokens avoided vs sending every page as an image
+print(result.tokens_saved)
 ```
 
-### Agent Skill
+Every page also carries an estimate of its cost both ways, as
+`page.tokens.text_tokens` and `page.tokens.image_tokens`.
+
+## As an agent skill
+
+`local-context-router` is an Agent Skill in the open `SKILL.md` format, so it
+works in Claude Code and other compatible agents. It lives in this repository
+under `.claude/skills/local-context-router`; copy that folder into your agent's
+skills directory:
+
+```sh
+cp -r .claude/skills/local-context-router ~/.claude/skills/
+```
 
-The `local-context-router` skill (in `.claude/skills/`) runs the same preflight
-inside Claude Code or Codex — copy it into your `.claude/skills/` (or `~/.claude/skills/`).
+With the package installed, the agent runs the preflight on any PDF or image you
+share, then uses the text for the cheap pages and attaches images only for the
+visual ones.
 
-## Requirements
+## Requirements and scope
 
-- macOS 10.15+ (on-device OCR uses the Apple Vision framework)
-- Python 3.10+
+- macOS 11 or newer. Recognition uses the Apple Vision framework and needs a
+  normal macOS graphics environment; it will not run inside a headless sandbox
+  that lacks one.
+- Python 3.10 or newer.
+- The scope is per-page routing, on-device OCR, and a token estimate. Retrieval
+  over very large documents is out of scope.
 
 ## License
 
-[MIT](LICENSE) © 2026 Siddharth Nashikkar
+MIT. See [LICENSE](LICENSE).
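
The new README replaces the old "2–10×" figure with "several times the cost". As a rough check, the per-page ranges it quotes (about 1,300 to 4,800 image tokens against 400 to 800 text tokens) bound the ratio as follows. The constant names are illustrative, and real costs depend on the provider and the page.

```python
# Per-page ranges quoted in the README (approximate).
IMAGE_TOKENS = (1300, 4800)
TEXT_TOKENS = (400, 800)

# Smallest ratio: cheapest image against the most expensive text page.
low = IMAGE_TOKENS[0] / TEXT_TOKENS[1]
# Largest ratio: most expensive image against the cheapest text page.
high = IMAGE_TOKENS[1] / TEXT_TOKENS[0]

print(f"image/text cost ratio: {low:.3f}x to {high:.1f}x")
```

With these figures the ratio runs from about 1.6× to 12×, so "several times" is a fair summary for a typical text page.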
diff --git a/ocr/README.md b/ocr/README.md
@@ -1,7 +1,7 @@
 # lcr-ocr
 
 On-device OCR binary used by LocalContextRouter. Wraps the Apple Vision
-framework — fully offline, no network, no entitlements, and no Screen Recording
+framework: fully offline, no network, no entitlements, and no Screen Recording
 permission (it reads image files you pass in, it does not capture the screen).
 
 ## Build
@@ -59,9 +59,9 @@ Follows the `sysexits.h` convention so callers can branch on failure mode:
 
 ## Layout
 
-- `Sources/LCROCR` — reusable library: image loading, the Vision engine, and the result models.
-- `Sources/lcr-ocr` — thin CLI over the library.
-- `Tests/LCROCRTests` — engine tests that render text in-process (no binary fixtures).
+- `Sources/LCROCR`: reusable library (image loading, the Vision engine, and the result models).
+- `Sources/lcr-ocr`: thin CLI over the library.
+- `Tests/LCROCRTests`: engine tests that render text in-process (no binary fixtures).
 
 ## Requirements
 

diff --git a/ocr/Sources/LCROCR/ImageLoading.swift b/ocr/Sources/LCROCR/ImageLoading.swift
@@ -20,7 +20,7 @@ public enum ImageLoadError: Error, CustomStringConvertible {
 /// Loads bitmaps from disk into `CGImage` using ImageIO.
 ///
 /// ImageIO is used instead of AppKit so the binary runs headless (no window
-/// server) — important for CI and for invocation from a CLI.
+/// server), which matters for CI and for invocation from a CLI.
 public enum ImageLoader {
     /// Decode the first image in the file at `path`.
     public static func loadCGImage(atPath path: String) throws -> CGImage {

diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "localcontextrouter"
-description = "Preflight router that picks the cheapest faithful path — text, OCR, or vision — before a document reaches a multimodal LLM."
+description = "Preflight router that picks the cheapest faithful path (text, OCR, or vision) before a document reaches a multimodal LLM."
 readme = "README.md"
 requires-python = ">=3.10"
 license = "MIT"

diff --git a/src/localcontextrouter/__init__.py b/src/localcontextrouter/__init__.py
@@ -1,4 +1,4 @@
-"""LocalContextRouter — cheapest faithful path for documents bound for a multimodal LLM."""
+"""LocalContextRouter: cheapest faithful path for documents bound for a multimodal LLM."""
 
 from .classify import classify_text, compute_signals
 from .detect import is_vision_worthy

diff --git a/src/localcontextrouter/classify.py b/src/localcontextrouter/classify.py
@@ -5,7 +5,7 @@
 absent (:class:`PageClass.SCANNED`), or present but broken
 (:class:`PageClass.GARBLED`). The two latter cases route to OCR downstream.
 
-Thresholds are deliberately conservative — when in doubt the page is sent to
+Thresholds are deliberately conservative: when in doubt the page is sent to
 OCR, since a wrong "digital" verdict silently feeds garbage to the model.
 """
 

diff --git a/src/localcontextrouter/cli.py b/src/localcontextrouter/cli.py
@@ -1,4 +1,4 @@
-"""``localctx`` — route a document and report the cheapest faithful source per page."""
+"""``localctx``: route a document and report the cheapest faithful source per page."""
 
 from __future__ import annotations
 

diff --git a/src/localcontextrouter/detect.py b/src/localcontextrouter/detect.py
@@ -3,7 +3,7 @@
 Some pages carry a perfectly good text layer yet still lose their meaning when
 flattened to text: tables, charts, diagrams, and figure-heavy layouts. Those are
 worth the vision-token cost. This module decides that from cheap layout features
-(:class:`~.models.PageFeatures`) — no rendering and no ML.
+(:class:`~.models.PageFeatures`), with no rendering and no ML.
 """
 
 from __future__ import annotations

diff --git a/src/localcontextrouter/models.py b/src/localcontextrouter/models.py
@@ -10,13 +10,13 @@ class PageClass(str, Enum):
     """How a PDF page should be sourced before it reaches an LLM."""
 
     DIGITAL = "digital"
-    """A usable embedded text layer is present — extract the text directly."""
+    """A usable embedded text layer is present; extract the text directly."""
 
     SCANNED = "scanned"
-    """Little or no text layer — the page is image-only and needs OCR."""
+    """Little or no text layer; the page is image-only and needs OCR."""
 
     GARBLED = "garbled"
-    """A text layer exists but is broken (unmapped glyphs) — OCR is safer."""
+    """A text layer exists but is broken (unmapped glyphs); OCR is safer."""
 
 
 @dataclass(frozen=True)
@@ -74,7 +74,7 @@ class Source(str, Enum):
     """Produced by on-device OCR after rendering the page."""
 
     VISION = "vision"
-    """Send the page to a vision model — its meaning lives in the visuals."""
+    """Send the page to a vision model; its meaning lives in the visuals."""
 
 
 @dataclass(frozen=True)

diff --git a/src/localcontextrouter/ocr.py b/src/localcontextrouter/ocr.py
@@ -120,7 +120,7 @@ def ocr_png_text(
 ) -> str:
     """OCR a PNG given as bytes; return the recognized lines joined by newlines.
 
-    Lines below ``min_confidence`` are dropped — useful for filtering the
+    Lines below ``min_confidence`` are dropped, which is useful for filtering the
     low-confidence glyphs that icons and logos tend to produce.
     """
     with tempfile.NamedTemporaryFile(suffix=".png") as tmp:

diff --git a/src/localcontextrouter/router.py b/src/localcontextrouter/router.py
@@ -1,7 +1,7 @@
 """Route each PDF page to the cheapest faithful source: text, OCR, or vision.
 
 - Digital pages keep their extracted text, unless their meaning lives in visuals
-  (tables, charts, diagrams) — those go to a vision model.
+  (tables, charts, diagrams); those go to a vision model.
 - Scanned or garbled pages are rendered and sent to OCR.
 
 Every page carries a token estimate so the savings of avoiding the image path

diff --git a/src/localcontextrouter/text.py b/src/localcontextrouter/text.py
@@ -1,6 +1,6 @@
 """Text normalization for routed output.
 
-Applied to the text a page contributes to the model — not before
+Applied to the text a page contributes to the model, not before
 classification, which relies on seeing control and replacement characters to
 spot a broken text layer.
 """