Skip to content

refactor(ai-explainer): simplify per review#229

Merged
spalen0 merged 2 commits into
mainfrom
refactor-ai-explainer-simplify
May 20, 2026
Merged

refactor(ai-explainer): simplify per review#229
spalen0 merged 2 commits into
mainfrom
refactor-ai-explainer-simplify

Conversation

@spalen0
Copy link
Copy Markdown
Collaborator

@spalen0 spalen0 commented May 20, 2026

Three internal-only cleanups from a code review of the AI/LLM modules. No behavior change; 239/239 tests still pass.

Changes

1. `_strip_solidity_noise` — single-pass `re.sub`

Was three sequential `re.finditer` loops each iterating character-by-character to overwrite a `list[chr]`. For a 500KB multi-file source bundle that's ~3n character writes plus the cost of joining the list back between passes. Replaced with one regex (alternation across the four noise patterns) and a callback that emits a same-length whitespace replacement.

2. `_storage_layout` — flatten 5-level nesting

The gap-consumption branch had `if consumed > 0 → if old_gap → if expected_new_gap < 0 / == 0 / > 0 → if new_gap...`. Extracted to two helpers:

  • `_check_gap_consumption(consumed, old_gap, new_gap) -> list[str]`
  • `_check_gap_only_change(old_gap, new_gap) -> list[str]`

Body is now a flat dispatch on `sign(consumed)`. Same behavior, easier to read and unit-test.

3. Promote leaky private imports

`on_chain_state.py` and `impl_diff.py` both imported `_fetch_source`, `_find_state_var_writes`, `_extract_state_var_snippet` from `source_context.py`. The underscore prefix signals module-private but two other production modules depend on them — de facto public API.

Renamed to drop the underscore. Updated 3 import sites + 2 test patch paths. The other private helpers (`_concat_sources`, `_extract_function_snippet`, `_extract_function_body`, `_build_context`) are still truly internal and keep their underscore.

Skipped findings

Noted but judged not worth the churn:

  • `ExplainerOptions` dataclass for the 8+ params on `explain_transaction` / `explain_batch_transaction` — caller-wide change for an ergonomic-only win.
  • Unifying `explain_transaction` vs `explain_batch_transaction` bodies — the single-call vs loop shape makes full unification ugly.
  • Risk-verdict enum — strings match the natural LLM-output shape.
  • Thread-pooled eth_calls — RPC rate-limit fragility; the per-process cache already absorbs the load.

Test plan

  • `uv run ruff check .` passes
  • `uv run ruff format --check .` passes
  • `uv run pytest tests/` — 239 pass (excluding the pre-existing telegram failure on main)

spalen0 and others added 2 commits May 20, 2026 15:17
Three internal-only cleanups from a code review pass over the AI/LLM modules.
No behavior change; 239/239 tests still pass.

1. utils/impl_diff._strip_solidity_noise — single-pass re.sub.

   Was: three sequential re.finditer loops, each iterating
   character-by-character to overwrite a list[chr]. For a 500KB multi-file
   source bundle that's ~3n character writes plus the cost of joining the
   list back to a string between passes.

   Now: one regex with alternation across the four noise patterns plus a
   re.sub callback that builds a same-length whitespace replacement. Single
   linear scan, no intermediate list allocations.

2. utils/impl_diff._storage_layout — flatten nesting.

   The gap-consumption branch reached 5 levels of nesting (if consumed > 0 →
   if old_gap → if expected_new_gap < 0 / == 0 / > 0 → inner if for new_gap
   shape). Extracted to two helpers:

     _check_gap_consumption(consumed, old_gap, new_gap) -> list[str]
     _check_gap_only_change(old_gap, new_gap) -> list[str]

   Body is now a flat dispatch on the sign of `consumed`. Same behavior,
   easier to read and individually testable.

3. Promote cross-module helpers from `_name` to `name` in source_context.py.

   Both on_chain_state.py and impl_diff.py imported three underscore-prefixed
   functions from source_context (`_fetch_source`, `_find_state_var_writes`,
   `_extract_state_var_snippet`). The underscore prefix signals
   module-private, but they were de facto public APIs because two other
   production modules depended on them.

   Renamed without underscore to match their actual usage. Updated three
   import sites + two test files that patched the old names. The other
   private helpers (_concat_sources, _extract_function_snippet,
   _extract_function_body, _build_context) are still truly internal and
   keep their underscore.

Skipped (noted but not worth the churn):
- ExplainerOptions dataclass for the 8+ params on explain_transaction /
  explain_batch_transaction — caller-wide change for an ergonomic-only win.
- Unifying explain_transaction vs explain_batch_transaction bodies — the
  single-call vs loop shape makes full unification ugly.
- Risk-verdict enum — strings match the natural LLM-output shape.
- Thread-pooled eth_calls — RPC rate-limit fragility, the per-process cache
  already absorbs most of the load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes 16 tests (-212 lines) that mostly assert "did X string appear in the
formatter output." These overlap heavily with the integration paths and
break on any wording change without flagging real bugs.

Dropped:
- TestFormatDecodedCalls (3)        — covered by every _build_prompt test
- TestFormatSimulationContext (5)   — same pattern, format X → assertIn X
- TestSystemPromptBrevity (1)       — asserts specific SYSTEM_PROMPT strings;
                                      fragile against prompt iteration
- TestFormatImplDiff (3)            — formatter substring checks
- TestFormatSourceContext (2)       — same
- 2 duplicate refine cases:
  test_pass_with_trailing_whitespace_still_keeps_draft (subsumed into
  test_pass_keeps_draft_unchanged with a whitespace-padded PASS)
  test_revision_with_empty_summary_falls_back (identical setup to
  test_empty_response_falls_back_to_draft)

Kept (all regression guards for real past bugs):
- TestParseExplanation — each tests a distinct LLM-output shape
- TestStorageLayout gap-consumption tests — caught the OZ false-unsafe issue
- TestRemovingDefaultInternalVarNowDetectedAsUnsafe — auditor regression
- TestProxyUpgrade.test_works_offline_for_all_proxy_selectors — perf regression
- TestProxyUpgrade.test_non_upgrade_short_circuits_before_decode — perf regression
- Proxy-follow tests in test_source_context.py and test_on_chain_state.py
- Refine flag wiring + skip_simulation tests

223/239 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@spalen0 spalen0 merged commit e7d72bb into main May 20, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant