feat: richer symbol metadata — signatures, JSDoc, type members, call graph#19
SutuSebastian merged 9 commits into main
Signatures now include generic type parameters (`<T extends Base>`), return type annotations (`: Promise<void>`), and class/interface heritage (`extends`, `implements`). Enum members are extracted into a new `members TEXT` column as JSON. Benchmarked on a 1,653-file React/TS codebase: 973 functions gain return types, 873 symbols gain generics, 291 enum member values captured across 50 enums. Index time +7% (negligible), DB size unchanged.
📝 Walkthrough
Adds richer symbol metadata and call/type-member extraction: schema version bumped to 2 with new
Changes
Sequence Diagram(s)
sequenceDiagram
participant CLI as Client
participant Parser as Parser
participant Adapter as Adapter
participant Indexer as IndexEngine
participant DB as SQLite
CLI->>Parser: parse file
Parser-->>Adapter: ExtractedData {symbols, typeMembers, calls, ...}
Adapter->>Indexer: submit ParsedFilePayload (includes typeMembers,calls)
Indexer->>DB: insertSymbols / insertTypeMembers / insertCalls
DB-->>Indexer: OK
Indexer-->>CLI: indexing result / stats (includes type_members, calls)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🦋 Changeset detected
Latest commit: 5d5d725
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Add doc_comment column to symbols (3,084 documented symbols on benchmark repo) and new type_members table for interface/type-alias properties (12,052 members extracted). Agents can now query type shapes and symbol documentation without reading files. SCHEMA_VERSION bumped to 2.
Captures string, number, boolean, null, negative numbers, `as const`, and simple template literals. 615 values extracted on benchmark repo.
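A minimal sketch of how the literal capture described above could work. The node shapes borrow ESTree-style names (`Literal`, `UnaryExpression`, `TemplateLiteral`, `TSAsExpression`), but the `Expr` type and `captureValue` helper are hypothetical and not taken from `src/parser.ts`. How `as const` on objects or arrays is stored is not stated in the commit, so this sketch only unwraps the `as` wrapper around a simple literal.

```typescript
// Hypothetical, simplified node shapes (assumption: ESTree-like naming).
type Expr =
  | { type: "Literal"; value: string | number | boolean | null }
  | { type: "UnaryExpression"; operator: string; argument: Expr }
  | { type: "TemplateLiteral"; quasis: { cooked: string }[]; expressions: Expr[] }
  | { type: "TSAsExpression"; expression: Expr }
  | { type: "Other" };

// Returns a printable value for simple initializers, or null when the
// initializer is too complex to record.
function captureValue(init: Expr): string | null {
  switch (init.type) {
    case "Literal":
      // Strings keep their quotes; numbers, booleans and null are stringified.
      return typeof init.value === "string" ? JSON.stringify(init.value) : String(init.value);
    case "UnaryExpression": {
      // Only negative numeric literals such as -1 are accepted.
      if (init.operator !== "-") return null;
      const arg = init.argument;
      return arg.type === "Literal" && typeof arg.value === "number" ? `-${arg.value}` : null;
    }
    case "TemplateLiteral": {
      // "Simple" here means a template with no ${...} expressions.
      const first = init.quasis[0];
      return init.expressions.length === 0 && init.quasis.length === 1 && first !== undefined
        ? JSON.stringify(first.cooked)
        : null;
    }
    case "TSAsExpression":
      // `x as const`: record the wrapped expression.
      return captureValue(init.expression);
    default:
      return null;
  }
}
```

Anything that falls through to `null` would simply leave the `value` column empty for that symbol.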
Adds parent_name column to symbols for scope tracking. A visitor stack assigns parent context to nested functions/consts (10,202 nested symbols on benchmark). Class methods, properties, and getters/setters are now extracted as individual symbols with parent_name pointing to the class.
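The visitor-stack idea can be sketched roughly as follows. The `Decl` tree and `collectSymbols` function are hypothetical stand-ins for the real AST walk, and for simplicity every node opens a scope here, which may differ from what the parser actually does for non-function consts.

```typescript
// Hypothetical minimal tree; the real parser walks a full TypeScript AST.
interface Decl {
  kind: "function" | "const" | "class" | "method";
  name: string;
  children?: Decl[];
}

interface SymbolRow {
  name: string;
  kind: string;
  parent_name: string | null;
}

function collectSymbols(roots: Decl[]): SymbolRow[] {
  const rows: SymbolRow[] = [];
  const scopeStack: string[] = [];

  const visit = (node: Decl): void => {
    // The innermost enclosing scope, if any, becomes parent_name.
    const parent = scopeStack.length > 0 ? scopeStack[scopeStack.length - 1] : null;
    rows.push({ name: node.name, kind: node.kind, parent_name: parent ?? null });

    scopeStack.push(node.name); // enter: this node is now the current scope
    for (const child of node.children ?? []) visit(child);
    scopeStack.pop(); // exit: restore the enclosing scope
  };

  roots.forEach(visit);
  return rows;
}
```

With a class `UserService` containing a method `run` that declares a nested `helper`, the rows come out with `parent_name` values of `null`, `UserService`, and `run` respectively.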
New `calls` table tracks which functions call which. Edges are deduped per file and limited to function-scoped calls (module-level excluded). Benchmark (merchant-dashboard-v2): 13,804 edges, 2,700 callers, 5,310 callees. Index time unchanged, DB +5MB (23→28MB).
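As a rough illustration of per-file edge deduplication and of the fan-in question the table is meant to answer, here is an in-memory sketch. The `CallEdge` shape mirrors the column names mentioned in this PR, but `dedupeEdges` and `fanIn` are hypothetical helpers; the real index answers these questions with SQL over the `calls` table.

```typescript
interface CallEdge {
  file_path: string;
  caller_name: string;
  callee_name: string;
}

// Drops repeated caller -> callee pairs within one file.
function dedupeEdges(edges: CallEdge[]): CallEdge[] {
  const seen = new Set<string>();
  const out: CallEdge[] = [];
  for (const e of edges) {
    // JSON-encode the tuple so names containing separators cannot collide.
    const key = JSON.stringify([e.file_path, e.caller_name, e.callee_name]);
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(e);
  }
  return out;
}

// Fan-in: number of distinct (file, caller) pairs that call each callee.
function fanIn(edges: CallEdge[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of dedupeEdges(edges)) {
    counts.set(e.callee_name, (counts.get(e.callee_name) ?? 0) + 1);
  }
  return counts;
}
```

A callee that never appears in the map has no recorded callers, which is the starting point for a dead-code check (subject to the module-level exclusion noted above).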
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/parser.ts (1)
163-197: ⚠️ Potential issue | 🔴 Critical
Fix scope tracking in multi-declarator variable declarations.
The `VariableDeclaration` handler pushes all function-valued declarators onto `scopeStack` before their initializers are visited, but `VariableDeclaration:exit` only pops those whose names match the current top of the stack. For `const a = () => {}, b = () => {}`, both are pushed during enter, then only `b` is popped during exit (since it's the top), leaving `a` stranded. This corrupts `currentParent()` for all subsequent symbols, affecting `parent_name`, call edges, and component/hook attribution for the rest of the file.
Current code (lines 204–219)
```typescript
"VariableDeclaration:exit"(node: any) {
  for (const decl of node.declarations) {
    const name = decl.id?.name;
    if (!name) continue;
    const init = decl.init;
    const isArrowOrFn =
      init?.type === "ArrowFunctionExpression" ||
      init?.type === "FunctionExpression";
    if (isArrowOrFn && scopeStack[scopeStack.length - 1] === name) {
      scopeStack.pop();
    }
```
Track pushed declarators individually or pop them in reverse order during exit to ensure all pushed names are removed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/parser.ts` around lines 163 - 197, The VariableDeclaration handler is pushing multiple function-valued declarator names onto scopeStack but VariableDeclaration:exit only pops when the top matches a single name, leaving earlier pushed names stranded; update the exit handler (the "VariableDeclaration:exit" block) to iterate the node.declarations in reverse order and pop from scopeStack for each declarator whose name matches the current top until all pushed declarators are removed (or use a per-declaration pushedNames list stored during enter and pop them in reverse), ensuring scopeStack and currentParent() are correctly restored for subsequent symbols.
🧹 Nitpick comments (2)
docs/architecture.md (1)
176-190: Document the new class-member `kind` values.
`src/parser.ts` now emits `method`, `property`, `getter`, and `setter` rows for class members, but this schema table still documents only the original six values. Anyone using this page as the schema contract will write filters that miss those symbols.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/architecture.md` around lines 176 - 190, Update the schema docs to include the new class-member kind values emitted by src/parser.ts: add "method", "property", "getter", and "setter" to the Kind column description so the table lists `function`, `const`, `class`, `interface`, `type`, `enum`, `method`, `property`, `getter`, `setter`; ensure the description clarifies these are used for class members (e.g., "method, property, getter, setter — class member symbols") so consumers can write correct filters for class members emitted by the parser.
templates/agents/skills/codemap/SKILL.md (1)
218-220: Keep the shipped template examples aligned with the authoring skill.
This copy only adds the enum-members lookup, while `.agents/skills/codemap/SKILL.md` now also documents `type_members`, `doc_comment`, `value`, `parent_name`, and `calls` queries. Because `codemap agents init` installs this file, published users won't see most of the new query surface.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@templates/agents/skills/codemap/SKILL.md` around lines 218 - 220, Update the shipped template in SKILL.md so it matches the authoring skill by adding the missing query examples and fields: extend the enum example (the SELECT from symbols where kind='enum' and name='TransactionStatus') to also show type_members, doc_comment, value, parent_name and calls columns; include example results or notes for each of those fields and a brief example query showing how to select type_members and doc_comment (and value/parent_name/calls) from the symbols table so the installed template exposes the same query surface as the authoring skill.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/db.ts`:
- Around line 125-130: The calls table currently stores only caller_name and
conflates same-named callers in a file; add a persistent qualified caller
identity (e.g., caller_key or caller_scope which encodes scope path or
parent+line metadata) and replace or augment caller_name with this column, then
enforce uniqueness/indexing on (file_path, caller_key) and use (file_path,
caller_key, callee_name) for deduping edges; update any creation/insert logic
that writes to calls (and the analogous schema at the other occurrence) to
populate caller_key and to dedupe/idx based on the new qualified key rather than
bare caller_name.
📒 Files selected for processing (14)
- .agents/rules/codemap.mdc
- .agents/skills/codemap/SKILL.md
- .changeset/richer-symbol-metadata.md
- docs/architecture.md
- src/adapters/builtin.ts
- src/adapters/types.ts
- src/application/index-engine.ts
- src/application/run-index.ts
- src/application/types.ts
- src/db.ts
- src/parsed-types.ts
- src/parser.test.ts
- src/parser.ts
- templates/agents/skills/codemap/SKILL.md
…declarator scope fix
- Add caller_scope column to calls table (dot-joined scope path, e.g.
UserService.run) so same-named methods across classes are no longer
conflated. Dedup key now uses full scope path.
- Handle this.foo() calls in CallExpression (emitted as this.methodName).
- Fix multi-declarator scope corruption: VariableDeclaration:exit now
iterates in reverse so const a = () => {}, b = () => {} correctly
pops both scopes.
- Document new class-member kind values (method, property, getter,
setter) in architecture.md.
- Sync template SKILL.md with authoring skill query examples.
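The multi-declarator fix in this commit can be sketched as a pair of enter/exit helpers. The `Declarator` shape and the function names are hypothetical simplifications; the actual handlers in `src/parser.ts` are visitor methods and may differ in detail.

```typescript
// Hypothetical, trimmed-down declarator shape.
interface Declarator {
  id: { name?: string };
  init?: { type: string };
}

const isFunctionValued = (d: Declarator): boolean =>
  d.init?.type === "ArrowFunctionExpression" || d.init?.type === "FunctionExpression";

// Enter: push every function-valued declarator, left to right.
function enterDeclaration(stack: string[], decls: Declarator[]): void {
  for (const d of decls) {
    if (d.id.name && isFunctionValued(d)) stack.push(d.id.name);
  }
}

// Exit: walk right to left so each name is on top of the stack when popped.
function exitDeclaration(stack: string[], decls: Declarator[]): void {
  for (let i = decls.length - 1; i >= 0; i--) {
    const d = decls[i];
    if (d && d.id.name && isFunctionValued(d) && stack[stack.length - 1] === d.id.name) {
      stack.pop();
    }
  }
}
```

For `const a = () => {}, b = () => {}` inside a scope `outer`, enter leaves the stack at `outer, a, b`; the reverse walk pops `b` and then `a`, so `outer` is the current parent again. A forward walk would check `a` first, find `b` on top, and leave `a` stranded.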
@coderabbitai Re: the outside-diff Critical finding on
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/db.ts (1)
8-12: ⚠️ Potential issue | 🟡 Minor
Update the schema-version policy comment to match current behavior.
The header comment still says to keep pre-release schema at `1`, but Line 12 now uses `2`. This can cause incorrect future versioning decisions.
Suggested doc fix
```diff
- * Pre-release: keep at **1** until the first npm release — do not bump for DDL
- * tweaks; run `--full` locally after pulling. After v1.0, bump in lockstep with
- * `createTables` / `createIndexes` when the on-disk schema changes.
+ * Bump in lockstep with `createTables` / `createIndexes` whenever on-disk schema
+ * changes. `createSchema()` rebuilds automatically on version mismatch.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db.ts` around lines 8 - 12, The header comment describing pre-release schema versioning is out of sync with the exported constant SCHEMA_VERSION (now 2); update the comment above SCHEMA_VERSION to reflect the current policy (e.g., “Pre-release: keep at 2 until the first npm release” or change the wording to avoid a hardcoded number and reference SCHEMA_VERSION) so future contributors won't be misled; locate the comment immediately above the exported constant SCHEMA_VERSION and adjust the wording accordingly.
🧹 Nitpick comments (2)
src/db.ts (1)
188-190: Add a `caller_scope` index for disambiguated call-graph lookups.
`calls` now disambiguates by `caller_scope`, but current indexes only cover `caller_name`, `callee_name`, and `file_path`. Queries keyed by scope will degrade as the table grows.
Suggested index addition
```diff
  CREATE INDEX IF NOT EXISTS idx_calls_caller ON calls(caller_name, file_path);
+ CREATE INDEX IF NOT EXISTS idx_calls_scope ON calls(caller_scope, file_path, callee_name);
  CREATE INDEX IF NOT EXISTS idx_calls_callee ON calls(callee_name, file_path);
  CREATE INDEX IF NOT EXISTS idx_calls_file ON calls(file_path);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db.ts` around lines 188 - 190, Add an index on the calls.caller_scope column to support disambiguated call-graph lookups; update the SQL DDL where existing indexes are created (see existing CREATE INDEX IF NOT EXISTS idx_calls_caller, idx_calls_callee, idx_calls_file) by adding a new statement such as creating idx_calls_caller_scope on calls(caller_scope) (or a multi-column index like idx_calls_caller_scope ON calls(caller_scope, caller_name, file_path) if queries filter by scope plus name/file) so that queries using caller_scope are covered and remain performant as the calls table grows.
docs/architecture.md (1)
314-316: Expand the parser extraction checklist for completeness.
This section now covers signatures and enum members, but it still omits other newly extracted outputs (`doc_comment`, `value`, `parent_name`, plus call/type-member extraction). Adding them here would keep the parser checklist self-contained.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/architecture.md` around lines 314 - 316, Update the parser extraction checklist paragraph (the bullets listing "Symbols", "Enum members", "Imports") to also enumerate the newly extracted outputs: include "doc_comment" (docblock/JSDoc associated with symbols), "value" (literal/initializer values for variables/enum members), "parent_name" (the enclosing class/module name for nested members), and "calls/type-members" (extracted call expressions and type member references). Amend the bullets so each item mirrors the existing format (give example shapes like JSON or type annotations) and reference the existing entries "Functions", "Enum members", and "Imports" to keep the checklist self-contained and consistent with the current phrasing.
📒 Files selected for processing (6)
- .agents/skills/codemap/SKILL.md
- docs/architecture.md
- src/db.ts
- src/parser.test.ts
- src/parser.ts
- templates/agents/skills/codemap/SKILL.md
✅ Files skipped from review due to trivial changes (2)
- src/parser.test.ts
- templates/agents/skills/codemap/SKILL.md
🚧 Files skipped from review as they are similar to previous changes (2)
- .agents/skills/codemap/SKILL.md
- src/parser.ts
- Cache scopeStack.join(".") via scopePush/scopePop helpers
- Hoist hot-path regex literals (RE_COMPONENT, RE_HOOK) to module scope
- Replace findJsDoc gap regex with charCodeAt loop (no .slice())
- Cache getProjectRoot() in getChangedFiles, insertParsedResults, indexFiles
- Eliminate redundant getAllFileHashes call in incremental path
- Batch DELETE for deleted files (single IN() vs N separate DELETEs)
- Hoist fileURLToPath and getProjectRoot() in worker-pool spawn loop
- Increase BATCH_SIZE 100→500 for fewer INSERT round-trips
- Zero-alloc isPathExcluded via charCodeAt segment scan
- Update docs: schema version 1→2, batch size references
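To make the batch-DELETE item concrete, here is a sketch that builds one parameterized `DELETE ... IN (...)` statement per chunk of paths instead of one statement per path. The helper name, the `Statement` shape, the `file_path` column, and the chunking itself are assumptions for illustration; the commit only says deleted files are removed with a single `IN()`.

```typescript
// Assumed chunk size; chosen to stay under SQLite's bound-parameter limit.
const DELETE_CHUNK_SIZE = 500;

interface Statement {
  sql: string;
  params: string[];
}

// Builds one DELETE per chunk of paths rather than one DELETE per path.
// `table` must be a trusted identifier: it is interpolated, not bound.
function buildDeleteStatements(
  table: string,
  paths: string[],
  chunkSize: number = DELETE_CHUNK_SIZE,
): Statement[] {
  const statements: Statement[] = [];
  for (let i = 0; i < paths.length; i += chunkSize) {
    const chunk = paths.slice(i, i + chunkSize);
    // One "?" placeholder per path keeps the values out of the SQL text.
    const placeholders = chunk.map(() => "?").join(", ");
    statements.push({
      sql: `DELETE FROM ${table} WHERE file_path IN (${placeholders})`,
      params: chunk,
    });
  }
  return statements;
}
```

Each returned statement would then be prepared and run with its `params` inside the same transaction as the rest of the incremental update.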
🧹 Nitpick comments (1)
src/db.ts (1)
185-190: Consider adding a composite unique index or constraint on `calls` for deduplication.
The documentation (line 202 in architecture.md) states edges are deduped per `(caller_scope, callee)` per file. While the parser may handle deduplication before insert, the schema doesn't enforce uniqueness. A unique index would prevent duplicate rows if the same data is inserted twice (e.g., during a retry or concurrent indexing scenario).
🔧 Optional: Add unique constraint for data integrity
```diff
- CREATE INDEX IF NOT EXISTS idx_calls_caller ON calls(caller_name, file_path);
+ CREATE UNIQUE INDEX IF NOT EXISTS idx_calls_unique ON calls(file_path, caller_scope, callee_name);
+ CREATE INDEX IF NOT EXISTS idx_calls_caller ON calls(caller_name, file_path);
```
Then update `insertCalls` to use `INSERT OR IGNORE` or `INSERT OR REPLACE`:
```diff
- "INSERT INTO calls (file_path, caller_name, caller_scope, callee_name)",
+ "INSERT OR IGNORE INTO calls (file_path, caller_name, caller_scope, callee_name)",
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db.ts` around lines 185 - 190, Add a schema-level unique constraint/index to enforce deduplication of call edges (e.g., CREATE UNIQUE INDEX IF NOT EXISTS idx_calls_unique ON calls(caller_scope, callee_name, file_path)) so the DB prevents duplicate rows for the same (caller_scope, callee) per file, and update the insert path (the insertCalls function) to use INSERT OR IGNORE or INSERT OR REPLACE to avoid errors on conflicts; include this change in the migration/initialization SQL so existing DBs get the new constraint and handle potential conflicts during inserts.
📒 Files selected for processing (8)
- .changeset/richer-symbol-metadata.md
- docs/architecture.md
- src/application/index-engine.ts
- src/application/run-index.ts
- src/db.ts
- src/parser.ts
- src/runtime.ts
- src/worker-pool.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- .changeset/richer-symbol-metadata.md
- src/application/run-index.ts
- src/application/index-engine.ts
- src/parser.ts
… checklist
- Update SCHEMA_VERSION comment to reflect current bump-on-change policy
- Add idx_calls_scope covering index for caller_scope queries
- Expand parser extraction checklist in architecture.md with JSDoc, const values, type members, call graph, and symbol nesting
Summary
Enriches the structural index with significantly more queryable data, eliminating the need for agents to read source files for common structural questions.
- `members TEXT` column on symbols with JSON array of `{name, value}` entries
- `doc_comment TEXT` column on symbols with cleaned comment text (preserves `@deprecated`, `@param`, etc.)
- `type_members` table indexing properties and methods of interfaces and object-literal type aliases (name, type, optional, readonly)
- `value TEXT` column on symbols for string/number/boolean/null literals, `as const`, simple template literals
- `parent_name TEXT` column on symbols tracking scope (class methods → class name, nested functions → parent function). Class methods, properties, and getters extracted as individual symbols
- `calls` table with function-scoped, per-file-deduped call edges (caller → callee). Includes `caller_scope` for qualified disambiguation (e.g. `UserService.run` vs `OrderService.run`). Captures `this.method()` calls. Enables fan-in/fan-out, "who calls X?", dead code detection
- `charCodeAt` JSDoc gap scan, cached `getProjectRoot()` in loops, eliminated redundant `getAllFileHashes` call in incremental path, batch DELETE for deleted files, hoisted `fileURLToPath`/`getProjectRoot()` in worker-pool, `BATCH_SIZE` 100→500

Schema changes
- `symbols`: added `members`, `doc_comment`, `value`, `parent_name` columns
- `type_members(file_path, symbol_name, name, type, is_optional, is_readonly)`
- `calls(file_path, caller_name, caller_scope, callee_name)`
- `SCHEMA_VERSION` bumped to 2

Benchmarks
merchant-dashboard-v2 (1,653 files, 24K symbols, 14K calls):
Test plan
Summary by CodeRabbit
New Features
Performance