Skip to content

fix: scope field extraction to target class to prevent cross-class injection#1953

Open
mashraf-222 wants to merge 7 commits intomainfrom
cf-1087-field-injection-class-filter
Open

fix: scope field extraction to target class to prevent cross-class injection#1953
mashraf-222 wants to merge 7 commits intomainfrom
cf-1087-field-injection-class-filter

Conversation

@mashraf-222
Copy link
Copy Markdown
Contributor

Problem

When the LLM generates optimization code containing inner/anonymous classes with fields, those fields are incorrectly injected into the target method's enclosing class. This produces uncompilable code — e.g., type parameters like Iterator<? extends F> injected where F is not in scope.

Observed in Guava PR #12 (Iterables.mergeSorted) where inner class fields were injected into the outer Iterables class.

Root Cause

_parse_optimization_source() in replacement.py (line 72) called analyzer.find_fields(new_source) without a class_name filter, extracting ALL fields from ALL classes in the generated code — including inner classes, static nested classes, and anonymous classes.

Fix

Pass class_name=target_method.class_name to find_fields() so only fields belonging to the target method's class are extracted. The find_fields() API already supported this filter via its _walk_tree_for_fields() implementation — it just wasn't being used.

The field extraction was also moved after the target method lookup (previously it ran before we knew which class the target was in).

Validation

  • Bug reproduced: test creates an inner class with badField → it was injected into outer class
  • Fix confirmed: badField no longer injected, only OFFSET (same class as target) is added
  • All 33 replacement tests pass (32 existing + 1 new)

Test Coverage

  • tests/test_languages/test_java/test_replacement.pyTestFieldInjectionClassFiltering::test_inner_class_fields_not_injected_into_outer

Closes CF-1087

…jection

find_fields() was called without a class_name filter, causing fields from
inner/anonymous classes to be injected into the outer target class. Now
scoped to target_method.class_name using the existing filter parameter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 1, 2026

Claude encountered an error —— View job


I'll analyze this and get back to you.

Add -> None return annotations and Path / JavaSupport parameter annotations
to every test method + fixture so the prek mypy hook passes when the file
is in the CI diff.
Rich renders the banner panel with box-drawing characters (╭, ╮, │, etc.)
that cp1252 cannot decode. On Windows, subprocess.run(..., text=True) uses
cp1252 by default, so decoding the child stdout raises UnicodeDecodeError
and subprocess sets result.stdout to None — breaking the assertion with a
misleading "argument of type 'NoneType' is not iterable".

Pass encoding="utf-8" explicitly so the test passes on every platform.
@mashraf-222
Copy link
Copy Markdown
Contributor Author

Review

Bug premise verified — real. _parse_optimization_source in codeflash/languages/java/replacement.py:72 on main calls analyzer.find_fields(new_source) without a class_name filter, so fields declared inside nested / anonymous / sibling classes in the LLM-generated optimization leak into new_fields_to_add of the outer target class, producing uncompilable code (the Guava Iterables.mergeSorted case: Iterator<? extends F> referencing an out-of-scope F).

Fix is minimal and correct. find_fields(new_source, class_name=target_class_name) uses the already-supported class filter — symmetric with find_constructors(..., class_name=...) at replacement.py:124 and find_fields(result, class_name=class_name) at replacement.py:242, both of which already filter. No new abstractions, no workaround.

Correctness edge cases: top-level siblings, static-nested, and inner classes are handled correctly by parser.py:615's existing current_class tracking. Anonymous inner classes keep fields attributed to the enclosing class (same as before — this PR doesn't change that), inherited fields from parent classes in other files are unaffected. One follow-up worth a small TODO: _walk_tree_for_fields at parser.py:615 only sets current_class for class_declaration — not enum_declaration / interface_declaration / record_declaration — so a target method inside an enum with fields would drop those fields after this change (vs. previously keeping all fields from the file). Low-probability regression; not blocking this PR.

CI blockers addressed in the last two commits:

  1. prek / prek was failing on pre-existing untyped tests in test_replacement.py that the prek mypy hook surfaced because this PR touches the file. Added -> None and missing Path / JavaSupport parameter annotations on all 33 test methods + the java_support fixture.
  2. unit-tests (windows-latest, 3.13)test_help_banner.py was failing on main (Rich box-drawing characters undecodable under cp1252). Added encoding="utf-8" to both subprocess.run calls — same fix as PR fix: decode help-banner test subprocess output as UTF-8 #2120.

e2e-java java-fibonacci-nogit CF-API 500 at test generation is an infra flake unrelated to this PR. Ready for re-review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant