feat: aa.plot_eval_heatmap — house-preset evaluation heatmap (prototype #310)#317
Draft
breimanntools wants to merge 7 commits into
#310) Consolidate the hand-built seaborn evaluation-heatmap block (duplicated in the gamma-secretase notebook cells 12/25 and the original project) into one top-level function with the house preset: viridis, fixed [vmin, vmax] color limits, integer annotations, labeled colorbar, horizontal ticks. Returns the Axes; library code never calls plt.show()/tight_layout(). The simple static sibling of the adaptive aap.plot_eval (documented in See Also).
- aaanalysis/plotting/_plot_eval_heatmap.py: frontend with Validate block
- wire to public API: plotting/__init__.py + __init__.py/__all__ + api.rst
- numpydoc + Examples include; executed example notebook
- 21 unit tests incl. seaborn-block equivalence (KPI) + bad-input ValueError
- release notes Unreleased + cheat sheet entry
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…, correct seaborn linewidths
Critical-review pass on #310:
- Long/normal-length column labels overlapped into unreadable mush because the x-tick labels were forced horizontal (rotation=0) with no sizing control. Add xtick_rotation / ytick_rotation (right-aligned when rotated) and a figsize arg; the committed example's long-label panel now renders clean.
- Use seaborn's documented linewidths= (was the ambiguous kwargs-passthrough linewidth=, which collides with seaborn's own linewidths=0 default).
- Compute the all-numeric column check once; add the # I Helper Functions skeleton header; validate the new params in the Validate block.
- Tests + example notebook updated and re-executed (inline backend).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adversarial correctness pass found no defect; NaN cells render blank and never raise. Guard NaN-graceful annotation and integer-dtype acceptance.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Use idiomatic df.empty for the min-shape guard and a single select_dtypes(exclude='number') for the all-numeric guard (drops the length compare + O(n^2) membership comprehension). Behavior identical.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…bel/ytick_rotation
Completes positive+negative coverage for every public param. Docstring checker clean (0 defects; the RAISES-UNDOCUMENTED advisory is house-consistent, since no plotting sibling documents Raises).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…for evaluation maps)
Evaluation maps should read as a square grid of equal-sized cells; sns.heatmap defaulted square=False (cells stretched to the figure). Add a square: bool=True param (opt out with square=False). Additive; equivalence test unaffected (checks data/clim/cmap/annotations, not aspect).
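
Taken together, the commits above describe a thin wrapper around sns.heatmap. The following is a minimal sketch of what such a wrapper might look like, not the code in this PR: the function name plot_eval_heatmap_sketch, the default limits (50 to 100), the default colorbar label, the default figsize, and the sample data are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_eval_heatmap_sketch(df_eval, vmin=50, vmax=100, cbar_label="Accuracy [%]",
                             xtick_rotation=0, ytick_rotation=0,
                             figsize=(6, 5), square=True, ax=None):
    """Hypothetical stand-in for aa.plot_eval_heatmap (not the library code)."""
    if ax is None:
        _, ax = plt.subplots(figsize=figsize)
    sns.heatmap(
        df_eval, ax=ax, cmap="viridis", vmin=vmin, vmax=vmax,
        annot=True, fmt=".0f",           # annotations rounded to whole numbers
        linewidths=0.1,                  # seaborn's documented keyword
        square=square,                   # equal-sized cells by default
        cbar_kws={"label": cbar_label},  # labeled colorbar
    )
    ax.tick_params(left=False, bottom=False)  # hide tick marks, keep labels
    # Rotated x labels are right-aligned so each one ends under its column.
    ha = "right" if xtick_rotation else "center"
    plt.setp(ax.get_xticklabels(), rotation=xtick_rotation, ha=ha)
    plt.setp(ax.get_yticklabels(), rotation=ytick_rotation)
    return ax  # no plt.show() / tight_layout() inside the helper


# Made-up evaluation grid, for illustration only.
df_eval = pd.DataFrame(
    [[62.0, 75.4], [88.1, 93.7]],
    index=["model_a", "model_b"], columns=["scale61", "CPP"],
)
ax = plot_eval_heatmap_sketch(df_eval, xtick_rotation=45)
```

Because the sketch returns the Axes, the caller decides when to call tight_layout(), savefig(), or show().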
Status: SOLID. Works, green local tests, full ripple done. New public function aa.plot_eval_heatmap, all 482 plotting+api tests pass, the example notebook re-executes under nbmake, and a regression test pins byte-equivalence to the seaborn block this consolidates.
What to review first
- aaanalysis/plotting/_plot_eval_heatmap.py: the frontend + Validate block.
- See Also in the docstring, documenting the relationship to aap.plot_eval (this is the simple static sibling; aap.plot_eval is the adaptive sweep summarizer). Confirm this framing is the intended reconciliation for #310's "single static-grid heatmap entry point" requirement.
- __init__.py/__all__ re-export (CONFIRM-FIRST, additive; on the wire-to-public-API list for this epic).
Open API question
aap.plot_eval." This prototype chose a standalone top-level helper (aa.plot_eval_heatmap) and documents the relationship rather than folding it into aap.plot_eval. If you'd rather expose it as a mode/flag on aap.plot_eval, that is a small follow-up.
Part of #305 / prototype for #310.
Summary
Consolidates the hand-built seaborn evaluation-heatmap block (duplicated in the γ-secretase notebook cells 12/25 and a further 3× in the original project) into one call with the house preset: viridis, fixed [vmin, vmax] color limits, annot with fmt=".0f", linewidth=0.1, labeled colorbar, no left/bottom ticks, horizontal tick labels. Returns the Axes; library code never calls plt.show()/tight_layout().
Full ripple
- versionadded, named Returns, See Also, Examples include.
- examples/plotting/plot_eval_heatmap.ipynb (inline backend, embedded PNG outputs, plt.tight_layout() + plt.show() in the notebook only).
- Axes; respects vmin/vmax/labels/cbar_label/ax; annotation count; horizontal ticks; bad input → ValueError; and a seaborn-block equivalence regression test (KPI).
- plotting/__init__.py, top-level __init__.py/__all__, and docs/source/api.rst.
Notes / not done
the full paper pipeline (heavy CPP/dPULearn runs) and is concurrently modified in the main checkout, so re-executing it overnight was out of scope. Equivalence to those cells is instead proven by test_matches_raw_seaborn_block (same heatmap array, clims, colormap, and annotations). Swapping the cells onto the new call is a trivial maintainer follow-up.
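
For readers who want a feel for what an equivalence check of this kind compares, here is a rough sketch. It is not the body of test_matches_raw_seaborn_block: the helper names heatmap_fingerprint, raw_block, and helper_sketch, the color limits, and the sample data are assumptions for illustration. The idea is to draw the hand-built block and the consolidated call on separate Axes and compare the mesh array, color limits, colormap, and annotation texts.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns


def heatmap_fingerprint(ax):
    """Collect the properties the equivalence check compares."""
    mesh = ax.collections[0]  # the QuadMesh drawn by sns.heatmap
    return {
        "array": np.ma.filled(mesh.get_array(), np.nan).ravel(),
        "clim": tuple(mesh.get_clim()),
        "cmap": mesh.get_cmap().name,
        "annotations": [t.get_text() for t in ax.texts],
    }


def raw_block(df, ax):
    # Stand-in for the hand-built notebook block (assumed settings).
    sns.heatmap(df, ax=ax, cmap="viridis", vmin=50, vmax=100,
                annot=True, fmt=".0f", linewidths=0.1)
    return ax


def helper_sketch(df, ax):
    # Hypothetical stand-in for the consolidated helper: same drawing call,
    # plus presentation-only extras that the fingerprint ignores.
    sns.heatmap(df, ax=ax, cmap="viridis", vmin=50, vmax=100,
                annot=True, fmt=".0f", linewidths=0.1,
                cbar_kws={"label": "Accuracy [%]"})
    ax.tick_params(left=False, bottom=False)
    return ax


# Made-up grid, for illustration only.
df = pd.DataFrame([[62.0, 75.4], [88.1, 93.7]],
                  index=["model_a", "model_b"], columns=["scale61", "CPP"])
_, (ax_raw, ax_new) = plt.subplots(1, 2)
fp_raw = heatmap_fingerprint(raw_block(df, ax_raw))
fp_new = heatmap_fingerprint(helper_sketch(df, ax_new))
```

A check built this way deliberately ignores line width, tick styling, and aspect, which is why purely presentational changes do not break it.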
🤖 Generated with Claude Code
Critical self-review (post-prototype pass)
Rendered the function to PNG under the house context (aa.plot_settings(), font 18) and inspected; found and fixed real defects:
Defects found + fixed
- was visible in the committed example notebook's own cell-2 output (25 feature50 f...). Forcing x labels horizontal (rotation=0) with no sizing control breaks for any non-trivial column-label length, including the realistic γ-secretase case (scale61 scale60 scale75 CPP). Added xtick_rotation/ytick_rotation (non-zero x rotation is right-aligned) and a figsize arg (consistent with the plot_rank sibling). The example's long-label panel now renders clean; re-executed with the inline backend.
- linewidth=0.1 was an ambiguous kwargs pass-through that collides with seaborn's own linewidths=0 default on pcolormesh. Switched to the documented linewidths=0.1. Visual/equivalence unchanged (the KPI test asserts array / clim / cmap / annotations, not line width).
- select_dtypes twice, now once; added the # I Helper Functions skeleton header; validate the new params in the Validate block; tightened the See Also.
Confirmed correct (not changed)
- Axes (not (fig, ax)): matches issue #310's explicit requirement; library code never calls plt.show()/tight_layout()/plot_settings().
- light), legible across the viridis range in the renders.
- cmap="viridis" kept as a literal: plot_get_cmap only ships the diverging CPP/SHAP palettes (wrong for a sequential 50-100 accuracy scale); viridis is the house preset named in the issue, so no preset applies.
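
The return contract confirmed above (hand back the Axes and leave layout and display to the caller) can be illustrated with a small sketch. The function draw_grid and its data are hypothetical and use plain matplotlib rather than the PR's code; the point is only the division of responsibility.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def draw_grid(values, ax=None):
    """Hypothetical library-style plotter: draws, then hands back the Axes."""
    if ax is None:
        _, ax = plt.subplots()
    ax.imshow(values, cmap="viridis", vmin=50, vmax=100)
    return ax  # deliberately no plt.show(), tight_layout(), or style changes


# Caller side (for example a notebook cell) owns layout and output.
ax = draw_grid(np.array([[62.0, 75.4], [88.1, 93.7]]))
ax.figure.tight_layout()
buffer = io.BytesIO()
ax.figure.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

Keeping side effects out of the plotting function lets the same call be composed into multi-panel figures or rendered off-screen in tests.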
Tests / gates (worktree-authoritative import verified)
- xtick_rotation (rotation + ha), ytick_rotation, figsize, and bad-type rejection.
- plotting_tests 312 passed, api_tests 175 passed (param-coverage, backend-import-hygiene, return-contract all green), docstring structural checker clean (only the sibling-consistent RAISES advisory), and --nbmake on the example notebook passes.
Residual
the new call (that notebook is heavy and concurrently edited in the main checkout); equivalence to those cells stays proven by test_matches_raw_seaborn_block. Trivial maintainer follow-up.
Iterative review log
- test_nan_cells_render_gracefully, test_integer_grid_accepted) to guard the NaN-graceful and integer-dtype behavior (a sweep table can legitimately carry NaN for a failed config). No stochastic path → reproducibility N/A.
- aa.plot_settings() (font 18) and default rcParams, across small (2×3), large (8×10), and long-label (45° rotated) grids. Verified: seaborn per-cell luminance contrast is legible across the whole viridis range (white on dark blue/teal, dark on green/yellow); annotation/tick/colorbar-label/colorbar-tick font sizes all scale correctly with rcParams (18/16.5/18/16.5 under plot_settings, 10 at default); colorbar is labeled; long labels right-align cleanly with no overlap/clipping; set_xticklabels renders with zero warnings (warnings-as-errors check passed); library code contains no plt.show/tight_layout/plot_settings. Committed example-notebook outputs re-verified clean. Non-square aspect kept deliberately to stay byte-equivalent to the consolidated seaborn block (ADR-0032).
- len(df_eval) == 0 or df_eval.shape[1] == 0 → idiomatic df_eval.empty; and the all-numeric check from select_dtypes(include=...) + length-compare + an O(n²) membership comprehension (error path) down to a single select_dtypes(exclude="number").columns.tolist(). Draw path already fully vectorized (no loops/iterrows/copies); verified byte-identical rejection on empty/bool/datetime and acceptance on float/int grids.
- ut.check_*; no ADR refs, no print, bare ValueError, viridis literal justified (no matching ut preset for a sequential 50–100 scale). Added negative-type tests for ylabel/cbar_label/ytick_rotation so every public param now has positive+negative coverage. Full local gate green: heatmap tests + param-coverage + backend-import-hygiene + plot-return-contract + docstring-contracts (113 passed).
- Example notebook re-executes cleanly against the worktree code (verified cell-by-cell); the local --nbmake miss is only the worktree/editable-install split (kernel resolves the installed main-checkout package), which does not occur in CI where the merged package is pip-installed.
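
The validation simplification recorded in the log (df_eval.empty for the minimum-shape guard, and one select_dtypes(exclude="number") call for the all-numeric guard) might look roughly like the sketch below. The function name check_df_eval and the error messages are assumptions; the PR's actual Validate block is not reproduced here.

```python
import pandas as pd


def check_df_eval(df_eval):
    """Hypothetical input guard; raises ValueError on unusable grids."""
    if not isinstance(df_eval, pd.DataFrame):
        raise ValueError("'df_eval' should be a pandas DataFrame")
    # DataFrame.empty is True when either axis has length 0, so it covers
    # both `len(df_eval) == 0` and `df_eval.shape[1] == 0`.
    if df_eval.empty:
        raise ValueError("'df_eval' should have at least one row and one column")
    # A single pass lists every column that is not a numeric dtype
    # (bool and datetime columns are not numeric, so they land here too).
    non_numeric = df_eval.select_dtypes(exclude="number").columns.tolist()
    if non_numeric:
        raise ValueError(f"'df_eval' has non-numeric columns: {non_numeric}")


def is_rejected(df):
    """Small helper for trying the guard: True if it raises ValueError."""
    try:
        check_df_eval(df)
    except ValueError:
        return True
    return False


accepted = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})  # int and float columns
rejected = [
    pd.DataFrame(),                                       # empty
    pd.DataFrame({"a": [True, False]}),                   # bool
    pd.DataFrame({"a": pd.to_datetime(["2020-01-01"])}),  # datetime
]
```

Listing the offending columns in the message is a design choice of this sketch; it makes the error actionable without a second dtype scan.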