fix: cap wildcard import expansion to avoid token explosion #1951
mashraf-222 wants to merge 6 commits into main
…ute stalls

Wildcard imports like `import org.jooq.*` expand to 870+ types, causing 5 minutes of disk I/O per function before the token budget check kicks in. 89% of jOOQ functions were skipped due to this.

When a wildcard expands to >50 types, filter to only types referenced in the target method's code. This turns a 5-minute failure into a <1 second resolution with only the relevant types included.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The prek mypy hook runs on changed files and bypasses the pyproject.toml tests/ exclude, surfacing pre-existing errors in both context.py and test_context.py that block CI for this PR.

Fixes applied:
- Import Language from language_enum instead of base (base re-exports are not explicit; strict mypy flags attr-defined)
- Annotate _extract_class_declaration, _import_to_statement, get_java_imported_type_skeletons, and resolved_imports
- Guard None start/end_line in _extract_function_source_by_lines and find_helper_functions; guard None file_path in the import skeleton loop
- Drop unreachable `if not node: continue` in _extract_public_method_signatures (JavaMethodNode.node is non-nullable)
- Add -> None to every test method and fix an `int | None` comparison in test_context.py

All 880 Java tests pass after the change.
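The "guard None start/end_line" fix follows a common narrowing pattern under strict mypy. A minimal sketch, assuming a simplified helper: the name `extract_source_by_lines` and its signature are hypothetical stand-ins, not the actual `_extract_function_source_by_lines` from context.py.

```python
from __future__ import annotations


def extract_source_by_lines(
    source: str, start_line: int | None, end_line: int | None
) -> str:
    """Return the 1-indexed, inclusive line range of `source`.

    Hypothetical helper for illustration. Returns "" when either bound is
    missing, so the slice arithmetic below only ever sees plain ints.
    """
    # Explicit None checks let mypy narrow `int | None` to `int`.
    if start_line is None or end_line is None:
        return ""
    lines = source.splitlines()
    return "\n".join(lines[start_line - 1 : end_line])
```

Returning an empty string on missing bounds is one possible policy; raising or skipping the node at the call site would narrow the types just as well.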
Rich renders the banner panel with box-drawing characters (╭, ╮, │, etc.) that cp1252 cannot decode. On Windows, subprocess.run(..., text=True) uses cp1252 by default, so decoding the child stdout raises UnicodeDecodeError and subprocess sets result.stdout to None — breaking the assertion with a misleading "argument of type 'NoneType' is not iterable". Pass encoding="utf-8" explicitly so the test passes on every platform.
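The encoding fix can be sketched as follows. This is an illustrative reproduction, not the project's test: the child command and the helper name `run_and_capture` are assumptions, and the child simply writes UTF-8 box-drawing bytes to stand in for the Rich banner.

```python
import subprocess
import sys


def run_and_capture(args: list[str]) -> subprocess.CompletedProcess:
    """Run a child process and decode its output as UTF-8 on every platform."""
    # Without encoding=..., text=True falls back to the locale encoding,
    # which is cp1252 on many Windows setups.
    return subprocess.run(
        args,
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=False,
    )


# Stand-in child: emits box-drawing characters as raw UTF-8 bytes.
CHILD = 'import sys; sys.stdout.buffer.write("\\u256d\\u2500\\u256e banner".encode("utf-8"))'
result = run_and_capture([sys.executable, "-c", CHILD])
```

With the encoding pinned, `result.stdout` is a string containing the box-drawing characters regardless of the host locale.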
Review

Bug premise verified: real. Fix is well-scoped. Two-phase expansion (count-capped probe → filter-by-referenced-types OR truncate to first 50) matches the neighboring CI blockers addressed in the last two commits:
Non-blocking gaps worth following up on: no direct unit tests for

Ready for re-review.
Problem
Wildcard imports like `import org.jooq.*` expand to 870+ types, causing 5 minutes of disk I/O per function before discovering the 4000-token skeleton budget is exceeded. In jOOQ, 89% of functions (70/79) were skipped due to token overflow from wildcard imports.

The `expand_wildcard_import()` function globs all `.java` files in the package directory unconditionally, and the token budget check in `get_java_imported_type_skeletons()` only fires after reading each file and parsing its skeleton, by which point hundreds of files have already been read from disk.

Root Cause
- `context.py:933-940`: Wildcard expansion happens without any count limit or early bailout.
- `import_resolver.py:223-252`: `expand_wildcard_import()` returns all types unconditionally.

Fix
`import_resolver.py`
- `max_types` parameter to `expand_wildcard_import()` for early termination
- `filter_names` parameter to only include types matching a given set

`context.py`
- `MAX_WILDCARD_TYPES_UNFILTERED = 50` constant
- `filter_names=priority_types` (only referenced types)

This turns a 5-minute failure into <1 second resolution with only the relevant types included.
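The two-phase expansion described above might look roughly like the sketch below. Only `expand_wildcard_import`, `max_types`, `filter_names`, `priority_types`, and `MAX_WILDCARD_TYPES_UNFILTERED` come from this PR; the function bodies, the `package_dir` argument, the `resolve_wildcard` wrapper, and the exact fallback rule are assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import Path

MAX_WILDCARD_TYPES_UNFILTERED = 50  # constant named in the PR


def expand_wildcard_import(
    package_dir: Path,
    max_types: int | None = None,
    filter_names: set[str] | None = None,
) -> list[str]:
    """List type names for the .java files directly inside package_dir.

    Assumed signature: only file names are inspected, nothing is read.
    """
    names: list[str] = []
    # Sorted for deterministic results; listing names is cheap compared
    # with reading and parsing each file.
    for path in sorted(package_dir.glob("*.java")):
        if filter_names is not None and path.stem not in filter_names:
            continue
        names.append(path.stem)
        if max_types is not None and len(names) >= max_types:
            break  # early termination once the cap is reached
    return names


def resolve_wildcard(package_dir: Path, priority_types: set[str]) -> list[str]:
    """Hypothetical caller showing the probe-then-filter flow."""
    # Phase 1: probe with cap + 1 to distinguish "exactly 50" from "more".
    probe = expand_wildcard_import(
        package_dir, max_types=MAX_WILDCARD_TYPES_UNFILTERED + 1
    )
    if len(probe) <= MAX_WILDCARD_TYPES_UNFILTERED:
        return probe
    # Phase 2: large package, keep only the types the target method references.
    if priority_types:
        return expand_wildcard_import(package_dir, filter_names=priority_types)
    # Assumed fallback when nothing is referenced: truncate to the first 50.
    return probe[:MAX_WILDCARD_TYPES_UNFILTERED]
```

Under these assumptions, a package with hundreds of types and a method that references three of them yields at most three names, so only those files would go on to be read and parsed for skeletons.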
Test Coverage
New test: `test_large_wildcard_is_filtered_to_referenced_types`

All 4 existing edge case tests pass unchanged.
Closes CF-1085
🤖 Generated with Claude Code