feat(plotting): dense-plot overlap gate + auto_font option + feature_map auto-size (#219) #326
breimanntools wants to merge 27 commits into
…ase 0, red) Add the first layout/visual assertions in the suite: a forced-render bbox overlap detector scoped to row-label text artists, plus a high-row df_feat fixture (up to the full 74-subcategory AAontology breadth). feature_map row labels overlap once the grid grows: the gate currently fails at 55 and 74 subcategories (0/0/54/73 overlaps at 20/36/55/74) and passes at 20/36. Phase 1 (shrink-to-floor-then-grow) turns it green. The detector self-tests and the font-floor checks already pass. Helper: tests/unit/plotting_tests/_text_overlap.py (get_label_overlaps, make_dense_df_feat). Plain matplotlib bbox introspection, no image snapshots. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
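The forced-render bbox check described in this commit can be sketched with plain matplotlib. This is a minimal illustration, not the helper in tests/unit/plotting_tests/_text_overlap.py: `count_text_overlaps` and `make_row_labels` are hypothetical names, and the sketch counts every overlapping pair, which may differ from how the real gate counts.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def count_text_overlaps(fig, texts):
    """Count pairs of text artists whose rendered bounding boxes intersect."""
    fig.canvas.draw()  # force a render so text extents are final
    renderer = fig.canvas.get_renderer()
    boxes = [t.get_window_extent(renderer) for t in texts]
    n_overlaps = 0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes[i].overlaps(boxes[j]):
                n_overlaps += 1
    return n_overlaps


def make_row_labels(n_rows, fontsize=10.0, figsize=(4, 4)):
    """Hypothetical stand-in for a row-label column: one text artist per row."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.set_ylim(0, n_rows)
    texts = [
        ax.text(0.0, i + 0.5, f"subcategory {i}", fontsize=fontsize, va="center")
        for i in range(n_rows)
    ]
    return fig, texts
```

Because the check inspects bounding boxes rather than pixels, it does not depend on image baselines.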
…issue #219, phase 1) The scale subcategory row labels of feature_map / heatmap are hand-placed text artists at the rcParams size (~10pt). On dense grids they collide into an unreadable column. PlotElements.add_subcat_bars now measures the rendered row labels and, only when they overlap, shrinks their font stepwise to a ~5pt legibility floor (optimize_subcat_label_fontsize). Sparse maps already fit, so their labels and output are untouched. Turns the phase-0 gate green: row-label overlaps at 20/36/55/74 subcategories go 0/0/54/73 -> 0/0/0/0; fonts stay >= 5pt (10/10/8.5/6.5). Fix lives in the shared plot_heatmap_ path, so standalone CPPPlot.heatmap benefits too. The separate importance-bar numeric-tick overlap ('40' over '0') no longer reproduces (the current ticks_0 show_only_max path resolved it); added a per-axis numeric-tick regression guard to lock the KPI in. Figure-height growing (when even the floor would collide) is deferred to phase 2's auto_font size auto-derivation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
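A shrink-to-floor loop of the kind this commit describes might look like the sketch below. It is an assumption-laden illustration, not optimize_subcat_label_fontsize itself: `has_overlap` and `shrink_labels_to_fit` are hypothetical names, the 0.5pt step is assumed, and only the 5pt floor comes from the commit message.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def has_overlap(fig, texts):
    """True if any two vertically adjacent labels overlap after rendering."""
    fig.canvas.draw()
    renderer = fig.canvas.get_renderer()
    boxes = [t.get_window_extent(renderer) for t in texts]
    # Labels are stacked in one column, so neighbour checks are enough here
    return any(a.overlaps(b) for a, b in zip(boxes, boxes[1:]))


def shrink_labels_to_fit(fig, texts, floor=5.0, step=0.5):
    """Shrink all labels stepwise until they fit or the floor is reached."""
    size = texts[0].get_fontsize()
    while has_overlap(fig, texts) and size - step >= floor:
        size -= step
        for t in texts:
            t.set_fontsize(size)
    return size


def make_labels(n_rows):
    """Hypothetical fixture: n_rows stacked labels at 10pt in a 4x4in figure."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.set_ylim(0, n_rows)
    texts = [ax.text(0.0, i + 0.5, f"row {i}", fontsize=10.0, va="center")
             for i in range(n_rows)]
    return fig, texts
```

Labels that already fit never enter the loop, which matches the stated goal of leaving sparse maps untouched.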
…o-derivation (issue #219, phase 2) Add the global aa.options["auto_font"] toggle (default False -> byte-identical output; validated as bool via _check_option; check_auto_font() re-exported through the utils barrel). When enabled and the user keeps the default figsize, CPPPlot.feature_map derives its figure size from the grid shape (n_subcat x n_positions) via derive_feature_map_figsize, so cells stay legible as the grid grows; an explicit figsize always wins. On a 74-subcategory map this grows the figure to ~(8.2, 12.2) and lifts the row-label font from the 5pt floor (auto_font off) back to ~8pt, with 0 overlaps. This is the "grow" dimension deferred from phase 1. Tests: config option valid/invalid + golden-keys/roundtrip updated; auto_font off keeps (8,8); on derives a larger dense figure, improves legibility, keeps 0 overlaps and >=5pt; explicit figsize overrides; derivation monotonic + bounded. Scope: this slice wires auto_font into the signature dense composite (feature_map). Broadening font auto-optimization to every *Plot method, and the ±15% constant-cell target at very small grids (fixed label/colorbar overhead dominates there), are follow-ups. Notebook re-execution deferred to the canonical env (auto_font off by default -> only a dense feature_map changes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
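The "monotonic + bounded" property tested for the size derivation can be illustrated with a small pure-Python sketch. `derive_figsize` below is a hypothetical stand-in for derive_feature_map_figsize; the per-cell size, overhead, and bounds are assumed values and will not reproduce the ~(8.2, 12.2) figure quoted above.

```python
def derive_figsize(n_subcat, n_positions, cell_in=0.15,
                   overhead_in=(2.0, 1.0),
                   min_in=(8.0, 8.0), max_in=(30.0, 30.0)):
    """Derive (width, height) in inches from the grid shape.

    All constants are illustrative assumptions, not the library's values.
    """
    width = overhead_in[0] + n_positions * cell_in
    height = overhead_in[1] + n_subcat * cell_in
    # Clamp so the result never drops below the default or grows unbounded
    width = min(max(width, min_in[0]), max_in[0])
    height = min(max(height, min_in[1]), max_in[1])
    return (width, height)
```

Clamping to the default as a lower bound keeps small grids at the familiar (8, 8) size, while the upper bound prevents runaway figures on very large grids.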
…#219, phase 3) Add opt-in image-comparison baselines for the signature dense composites (feature_map, heatmap, profile) using pytest-mpl, which is already a dev dependency but was dormant (no markers, no baselines). These catch silent visual degradation the type/return-shape tests cannot. Opt-in by design: pytest-mpl only compares against the committed baselines when --mpl is passed, so the blocking CI matrix (no --mpl) just runs the functions. Run locally with `pytest tests/unit/plotting_tests/test_visual_regression.py --mpl`; regenerate with --mpl-generate-path when an intended visual change lands. Baselines are matplotlib/freetype-version sensitive, hence not wired into the gating matrix. No new dependency added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n sparse maps (issue #219, review loop 1)
- feature_map: guard the figsize default-detection against None. With auto_font on, figsize=None (or the (8,8) default) now auto-derives instead of raising TypeError from tuple(None). (Bare figsize=None was already unsupported downstream with auto_font off; unchanged there.)
- optimize_subcat_label_fontsize: add a conservative cheap pre-check (rows vs a 0.6 axes-height fraction) so the common sparse map skips the forced render; only render+measure when overlap is plausible. Errs toward rendering, never toward wrongly skipping an overlap.
Overlaps stay 0 at 20/36/55/74; new tests cover figsize=None under auto_font.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
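One way to read the cheap pre-check is as a comparison between the stacked label height and a fraction of the axes height. The sketch below is an interpretation under that assumption; `overlap_is_plausible` is a hypothetical name and only the 0.6 fraction comes from the commit message.

```python
def overlap_is_plausible(n_rows, fontsize_pt, axes_height_in, fill_fraction=0.6):
    """Cheap estimate: could stacked row labels collide?

    Assumes each label needs roughly its font size in height (72pt = 1in).
    Returns True when a full render-and-measure pass is worth running.
    """
    label_stack_in = n_rows * fontsize_pt / 72.0
    return label_stack_in > fill_fraction * axes_height_in
```

With the threshold below 1.0, the estimate triggers a render before the labels would actually fill the axes, so it errs toward measuring.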
…lation guard (issue #219, review loop 2)
- Lock in that standalone CPPPlot.heatmap benefits from the shared-path row-label shrink (0 overlaps at 20/74), per the issue's "extend the gate to heatmap" requirement.
- Add an isolation guard asserting auto_font resets to False each test, confirming the new option is covered by the existing _dict_options autouse reset fixture (no cross-test leakage).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ote (issue #219, review loop 3)
- Avoid a new reportOptionalOperand diagnostic: the n_positions sum used the Optional-typed jmd lengths directly; guard with `or 0` (they are validated non-None above) so the advisory pyright count does not rise above the 887 baseline.
- Add a .. versionchanged:: 1.1.0 note to feature_map's figsize documenting the auto_font grid-shape derivation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
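The `or 0` guard mentioned here follows a common pattern for Optional-typed operands. A minimal sketch, with a hypothetical function name and parameter names chosen for illustration:

```python
from typing import Optional


def count_positions(tmd_len: int,
                    jmd_n_len: Optional[int],
                    jmd_c_len: Optional[int]) -> int:
    """Sum part lengths; `or 0` narrows Optional[int] to int for the checker."""
    return tmd_len + (jmd_n_len or 0) + (jmd_c_len or 0)
```

The guard changes nothing at runtime when the values are already validated as non-None; it only gives the type checker an `int` on both sides of the `+`.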
…ew (issue #219, review loop 4) Adversarial review found no defects; guard the verified robustness properties so they cannot regress: single-subcategory frame (optimizer early-returns, plot still succeeds), col_cat='category' (few rows, no shrink) and col_cat='scale_name' (highest-density label axis) both clear all overlaps. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ers) get_optimal_fontsize gains fill=False; when True it drops the required inter-char gap to 0 so sequence characters grow until adjacent residues touch (no whitespace) while still never overlapping. Threaded through update_seq_size_/_add_part_seq/ add_tmd_jmd_seq/plot_heatmap_/plot_feature_map and exposed as CPPPlot.feature_map seq_char_fill=False. Default off = byte-identical (regression-tested).
Hold the heatmap grid at a constant width:height = 1:1.15 box via set_box_aspect whenever auto_font auto-sizes the figure, so the feature map looks the same regardless of the number of subcategories/positions. derive_feature_map_figsize now grows the figure WIDTH with the longest subcategory row-label (names never clipped) and tracks the grid height off the constant aspect, instead of growing with n_subcat. auto_font off (default) leaves the grid untouched -> byte-identical.
breimanntools
left a comment
The plots do not look better. This improvement did not work! Do it again and make a before/after comparison to demonstrate to me that it works. I want robust scaling methods that produce optimal plots for sequences of flexible length!
…rid-aspect + default seq_char_fill=True
Two follow-ups requested after review of the grid-aspect / seq_char_fill work:
- Tick fix: set_box_aspect (the 1:1.15 auto_font grid) triggered a relayout that re-exposed the shared position/importance x-tick labels on the top-row axes (top importance bar + empty top-right cell), so '1/10/30/40' and '0.0/0.5/1.0' appeared ABOVE the grid. Now, after the final layout, the top-row axes have their x-tick labels hidden, so position ticks stay on the heatmap (bottom) only - matching the non-auto_font look. Grid aspect itself is kept.
- seq_char_fill now defaults to True: residue characters fill edge-to-edge (no whitespace, still never overlapping) by default, for larger/cleaner sequence letters. Updated the docstring (versionchanged 1.1.0) and the default-behavior test accordingly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Codecov Report
❌ Your patch check has failed because the patch coverage (83.76%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## master #326 +/- ##
==========================================
- Coverage 94.93% 94.85% -0.09%
==========================================
Files 185 186 +1
Lines 17883 17993 +110
Branches 3038 3065 +27
==========================================
+ Hits 16978 17067 +89
- Misses 598 608 +10
- Partials 307 318 +11
... and 1 file with indirect coverage changes
…ered scale-sweep gallery (issue #219)
- feature_map: hide the top-row axes' redundant shared x-tick labels in ALL renders (not only under auto_font). The top importance bar shares the position x-axis with the heatmap, so 1/10/30/40 (and 0.0/0.5/1.0) were drawn ABOVE the grid whenever a figure is saved bbox_inches="tight" (notebooks/inline backend, publication savefig). Now position ticks live on the heatmap bottom only, at every data scale. Pre-existing in master; this is the general fix.
- Visual baselines now use realistic bundled DOM_GSEC features (load_features) instead of the synthetic single-position stress fixture, and are regenerated. Baselines must look like real output so a human can trust a diff.
- Add .github/scripts/plot_gallery.py: renders plots across 5 data scales (~10% "tiny" to ~1000% "huge") on realistic data, ALWAYS bbox_inches="tight" (the way plots are consumed; hashing the raw canvas silently misses layout changes like these ticks). Step count scales with figure complexity: key figures (feature_map/heatmap) 5 steps, medium (profile/ranking) 3, simple plots + evals 1. Emits per-plot hashes for byte-exact A/B compare across versions.
Verified: all 18 renders succeed, 10% and 1000% both render cleanly (labels legible, no overlap, ticks on the bottom).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
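Hiding the x-tick labels of a top axes that shares its x-axis with the heatmap can be sketched as follows. This is a generic matplotlib illustration with made-up data and axes names, not the feature_map code path.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 5))
gs = fig.add_gridspec(2, 1, height_ratios=[1, 4])
ax_heat = fig.add_subplot(gs[1])
# Sharing x this way leaves the top axes' tick labels visible by default
ax_top = fig.add_subplot(gs[0], sharex=ax_heat)

ax_heat.pcolormesh([[0, 1, 2], [2, 1, 0]])
ax_top.bar([0.5, 1.5, 2.5], [1.0, 0.5, 0.2])

# Keep position ticks on the bottom (heatmap) axes only
ax_top.tick_params(axis="x", labelbottom=False)

fig.canvas.draw()
top_visible = [lbl.get_visible() for lbl in ax_top.get_xticklabels()]
heat_visible = [lbl.get_visible() for lbl in ax_heat.get_xticklabels()]
```

Applying the `tick_params` call after the final layout step matters when a later relayout could re-expose the labels, which is the failure mode the commit describes.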
…ot gallery
Extends the gallery with the most layout-sensitive path: feature_map and heatmap
rendered WITH a TMD-JMD sequence across TMD lengths 5/10/20/40/80/100 aa, in both
plain and CPP-SHAP (shap_plot=True) modes. Uses whole-TMD segment features so the
df_feat stays valid at any tmd_len, and includes feat_impact/mean_dif columns for
the SHAP variant.
Verified visually: short sequences (5 aa) render large edge-to-edge residue
letters (seq_char_fill), long sequences (100 aa, 120 columns) shrink-to-fit with
no overlap and no stray top ticks; all 42 renders (scale sweep + AAclust +
sequence x {plain,shap}) succeed. Gallery lands in the outdir for visual review.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
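The gallery's per-plot hashes for A/B comparison could be produced along these lines. This is a sketch under assumptions: `render_hash` is a hypothetical helper, the plot is a toy grid, and hashes are only comparable within one OS/font/matplotlib environment.

```python
import hashlib
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def render_hash(n_rows, dpi=100):
    """Render a toy grid the way plots are consumed and hash the PNG bytes."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.pcolormesh([[r + c for c in range(5)] for r in range(n_rows)])
    buf = io.BytesIO()
    # bbox_inches="tight" so layout changes outside the axes affect the hash
    fig.savefig(buf, format="png", bbox_inches="tight", dpi=dpi)
    plt.close(fig)
    return hashlib.sha256(buf.getvalue()).hexdigest()
```

Hashing the saved file rather than the raw canvas means that anything changing the tight bounding box, such as stray tick labels, also changes the hash.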
…tters stay legible (issue #219) Under auto_font, when a TMD-JMD sequence is shown, the figure width now grows with the number of residue positions so each residue column keeps a minimum on-figure width (SEQ_MIN_RESIDUE_WIDTH_IN=0.12in, capped at 30in) instead of shrinking the letters toward nothing. adjust_figsize_for_sequence() computes the widened size and drops the constant 1:1.15 grid aspect for long sequences (a forced square would squeeze the wide sequence back in). Short sequences keep the constant aspect. Verified 5->100 aa: width 10.3->17.4in, per-residue width held at ~0.12in floor; 100-aa TMD letters now legible (were ~3pt). auto_font off (default) unchanged. Gallery sequence sweep now renders under auto_font to exercise this. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
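The minimum-width rule can be written down as a short calculation. In the sketch below only the 0.12in floor and the 30in cap come from the commit message; `widen_for_sequence` and the `grid_fraction` parameter (share of figure width taken by the grid) are assumptions, so it will not reproduce the 10.3 to 17.4in figures quoted above.

```python
SEQ_MIN_RESIDUE_WIDTH_IN = 0.12  # floor from the commit message
MAX_FIG_WIDTH_IN = 30.0          # cap from the commit message


def widen_for_sequence(figsize, n_positions, grid_fraction=0.6):
    """Grow the figure width so each residue column keeps a minimum width.

    grid_fraction is a hypothetical share of the figure width used by the grid.
    """
    width, height = figsize
    needed = n_positions * SEQ_MIN_RESIDUE_WIDTH_IN / grid_fraction
    new_width = min(max(width, needed), MAX_FIG_WIDTH_IN)
    return (new_width, height)
```

For short sequences the entry width already satisfies the floor, so the figure is returned unchanged.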
…15 -> 1.05
- add_xticks now sets direction="out" so the TMD-JMD position tick marks (1/10/30/40) point DOWN below the part bar, not up into the heatmap cells.
- Lower the auto_font grid aspect from 1.15 to 1.05 (height:width) so the grid is less tall / closer to square. Test references the FEATURE_MAP_GRID_ASPECT constant instead of a hardcoded value.
- Regenerate the feature_map/heatmap visual baselines.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…regression The committed baseline PNGs rendered in the test harness's raw-matplotlib context (no aa.plot_settings()), which crammed the colorbar to 5 ticks and clipped the importance-bar labels - nothing like the clean plots users actually see (verified pixel-identical to 1.0.3 under real usage). They were fragile (render-context / matplotlib-version sensitive), opt-in (not in the CI gate), and misrepresented the plots in the PR diff. Visual quality remains guarded by the robust structural overlap gate (test_dense_label_overlap.py: forced-render bbox overlap, ticks-bottom-only, font >= 5pt) and the on-demand scale/sequence gallery (.github/scripts/plot_gallery.py). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… baselines) Reinstate pytest-mpl visual regression, done right this time: the baselines are rendered the way plots are actually consumed - aa.plot_settings() (house rcParams) + bbox_inches="tight" + fixed dpi - so the committed baseline images look like real output (clean 3-tick colorbar, unclipped labels, ticks on the bottom), not the raw-matplotlib artifact the earlier baselines showed. pytest compares the rendered pixels against the committed baseline within an RMS tolerance, catching any future visual change. Opt-in via --mpl (pixel comparison is matplotlib/freetype-version sensitive), so it is not wired into the blocking matrix; the environment-independent guarantees stay in test_dense_label_overlap.py. pytest tests/unit/plotting_tests/test_visual_regression.py --mpl Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
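A pytest-mpl test rendered with tight bounding box and fixed dpi might look like the sketch below. The test name, baseline directory, tolerance value, and toy plot are assumptions; the real tests also apply aa.plot_settings(), which this stand-in omits.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pytest


@pytest.mark.mpl_image_compare(
    baseline_dir="baseline",  # hypothetical directory of committed PNGs
    tolerance=2,              # assumed RMS tolerance
    savefig_kwargs={"bbox_inches": "tight", "dpi": 100},
)
def test_dense_grid_visual():
    """Return the figure; pytest-mpl saves and compares it when --mpl is set."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.pcolormesh([[r + c for c in range(10)] for r in range(20)])
    return fig
```

Without `--mpl` the function simply runs and returns its figure, which is why such tests can sit in a suite without gating it.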
Dedicated single-environment pixel gate for the signature CPP plots so the --mpl comparison actually blocks on visual regressions (unlike the opt-in local run). Runs on ubuntu-latest, Python 3.11, matplotlib pinned to 3.11.0, on PRs to master that touch plotting code or the plotting tests. Because pixel rendering is OS/font/matplotlib-version sensitive, the baselines are meant to be generated on THIS runner (workflow_dispatch with regenerate=true uploads them as an artifact) so comparison is self-consistent. On a failing gate the diff images are uploaded as an artifact for inspection. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sion gate pytest-mpl pixel comparison is OS/font sensitive (Arial on macOS vs DejaVu on the Linux runner), so replace the locally-generated baselines with the ones rendered by the Visual Regression workflow's own ubuntu-latest / matplotlib==3.11.0 environment. Baselines and comparison now share the exact runner, so the gate is self-consistent and green. Verified the Linux render is clean (3-tick colorbar, labels fit, ticks on the bottom). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
breimanntools
left a comment
I am not satisfied with the results. We must take good care of this PR in the end.
Previously auto_font fixed the whole grid BOX to a constant aspect, so individual heatmap cells stretched/squashed with the row:column ratio. Now the per-CELL shape is held at width:height = 1 : FEATURE_MAP_CELL_ASPECT (1.1) via set_box_aspect(cell_aspect * n_subcat / n_positions), so every cell renders the same shape regardless of how many subcategories/positions the grid has.
- set_box_aspect is applied AFTER tight_layout (tight_layout would otherwise reset it), and the figure height is kept modest (not grown with n_subcat) so the box aspect is always satisfiable. Verified cell h/w = 1.10 at 8/24/55/74 subcategories.
- derive_feature_map_figsize simplified accordingly (width grows with positions + row-label length; square-ish height). Removed the now-redundant adjust_figsize_for_sequence helper (long sequences are handled by the width term).
- Gallery scale-sweep now renders composites under auto_font so it demonstrates the constant cell shape across scales.
auto_font off (default) is unchanged -> byte-identical; the pixel-comparison baselines (auto_font off) are unaffected. All plotting/feature-map tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
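The per-cell aspect formula can be checked on a toy grid. The sketch below uses plain matplotlib with an assumed figure size and grid shape; `CELL_ASPECT` mirrors the 1.1 value quoted in the commit, and everything else is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

CELL_ASPECT = 1.1  # cell height / cell width, as quoted in the commit
n_subcat, n_positions = 24, 40

fig, ax = plt.subplots(figsize=(8, 6))
ax.pcolormesh([[r + c for c in range(n_positions)] for r in range(n_subcat)])

fig.tight_layout()
# Axes box height/width chosen so that every cell has the target shape
ax.set_box_aspect(CELL_ASPECT * n_subcat / n_positions)

fig.canvas.draw()  # the aspect is applied to the axes position on draw
box = ax.get_window_extent()
cell_ratio = (box.height / n_subcat) / (box.width / n_positions)
```

Since the box aspect is height over width of the axes, dividing by the row and column counts recovers the cell ratio, which is how `cell_ratio` comes out near 1.1 regardless of the grid shape.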
Rework auto_font so the composite CPP plots (feature_map, heatmap, profile, ranking) hold each grid cell at a constant physical size and grow the figure with the data, instead of shrinking cells/fonts. Fonts (in points) stay at the plot_settings values regardless of grid size, so no manual font_scale is needed.
- New _utils_cpp_plot_sizing.py: fit_cells_by_rescale (measure the grid's figure-fraction after layout, rescale the whole figure so cells hit a target inch size; siblings keep their fractions, fonts stay fixed), fit_width_by_rescale (profile, width-only), ranking_figheight (0.22*n+1).
- feature_map/heatmap/profile/ranking wired to the constant-size path.
- Replace the box_aspect (constant-ratio) + width-clamp mechanism.
- Demote the 5pt shrink gate to a fallback via optimize_labels, fired only when auto_font is on AND the caller forces a fixed figsize.
- Flip the auto_font default to True; auto_font=False reproduces the prior fixed-size output.
- Rewrite the layout tests to assert the invariants (constant cell size, font neither shrunk nor enlarged, no overlap, figure grows, forced-figsize fallback).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
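The rescale idea (measure the grid's figure fraction, then resize the figure so cells hit a target size) can be sketched as below. The function bodies are assumptions written for illustration; only the names fit_cells_by_rescale and ranking_figheight and the 0.22*n+1 formula come from the commit message, and the 0.2in cell size is made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def fit_cells_by_rescale(fig, ax, n_rows, n_cols, cell_in=0.2):
    """Resize the figure so each grid cell of `ax` is cell_in inches square.

    Illustrative only: the axes keeps its figure fraction, so scaling the
    figure scales the cells while font sizes (in points) stay fixed.
    """
    fig.canvas.draw()        # make sure the layout is final before measuring
    pos = ax.get_position()  # axes box in figure fractions
    new_w = n_cols * cell_in / pos.width
    new_h = n_rows * cell_in / pos.height
    fig.set_size_inches(new_w, new_h)
    return new_w, new_h


def ranking_figheight(n):
    """Figure height in inches for n ranked rows (formula from the commit)."""
    return 0.22 * n + 1
```

Because fonts are specified in points, they keep their physical size when the figure grows, which is the stated reason for rescaling the figure instead of shrinking labels.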
…q_char_fill byte-identity
Adversarial review fixes:
- Floor the rescale at the entry (default) size: auto_font only GROWS the figure, never shrinks below (8,8)/(7,5). Fixes small grids (n<~26) collapsing to a degenerate figure with overflowing fonts now that auto_font is default-on. Cells stay constant above the floor; below it the figure sits at the default.
- Normalize figsize=None to the default in feature_map/heatmap/profile up-front, so figsize=None works regardless of auto_font state (was a TypeError when off).
- seq_char_fill defaults to None = follow auto_font (on under auto-sizing, off when auto_font=False so that path stays byte-identical); explicit True/False wins.
- Emit a verbose-gated warning when the SAFETY_CAP is hit (was silent).
- Tests: constant-cell invariant now on dense grids; add small-grid floor tests, figsize=None-in-both-states, seq_char_fill-off-when-auto_font-off.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
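The grow-only floor and the tri-state seq_char_fill default can both be expressed in a few lines. The helper names below are hypothetical; the (8, 8) default and the None-follows-auto_font rule are taken from the commit message.

```python
def floor_at_default(derived, default=(8.0, 8.0)):
    """Grow-only sizing: never return a figure smaller than the default."""
    return tuple(max(d, f) for d, f in zip(derived, default))


def resolve_seq_char_fill(seq_char_fill, auto_font):
    """None means 'follow auto_font'; an explicit True/False always wins."""
    return auto_font if seq_char_fill is None else seq_char_fill
```

Using None as the default keeps the auto_font=False path unchanged while still letting callers force either behavior explicitly.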
… default-on The auto_font=True default resizes the composite plots (feature_map/heatmap/profile/ ranking) to constant-cell-size figures; refresh the committed notebook outputs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Demonstrate the auto_font option (default on) with feature_map rendered on the DOM_GSEC feature set at auto_font True vs False, so users can eyeball the constant-cell-size behavior. Naturally wired via the options example include (no orphan doc). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Update the two config tests that pinned the pre-flip default (options default dict + the default-value assertion) to match auto_font=True. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…he structure explorer An explicit figsize (including (8,8)) now always wins over auto_font: feature_map's figsize default becomes None, and auto-sizing applies only when figsize is OMITTED. Previously (8,8) was the auto sentinel, so an embedded consumer could not pin a size. CPPStructurePlot.explore/plot_combined embed feature_map at a fixed size and map its axes fractions to pixels, so they must not auto-size; pin figsize=(8,8) on all three feature_map calls (a caller-supplied figsize still wins). Fixes the distorted feature map in the interactive per-site SHAP viewer under the auto_font default. +1 test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
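Why a None sentinel is needed can be seen in a small sketch. `resolve_figsize` is a hypothetical helper, and the derived size passed in is a placeholder for whatever the auto-sizing path computes.

```python
DEFAULT_FIGSIZE = (8.0, 8.0)


def resolve_figsize(figsize=None, auto_font=True, derived=(10.0, 14.0)):
    """Return (figsize, was_auto_sized).

    With None as the sentinel, an explicit (8, 8) is distinguishable from
    'argument omitted', so an embedding caller can pin the size.
    """
    if figsize is not None:
        return tuple(figsize), False  # explicit size always wins
    if auto_font:
        return tuple(derived), True   # omitted + auto_font: auto-size
    return DEFAULT_FIGSIZE, False     # omitted + auto_font off: fixed default
```

A value-based sentinel such as (8, 8) cannot express "I want exactly the default size", which is the case the structure explorer needs.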
Makes the dense CPP composite plots stable across data sizes (issue #219), reworked to hold grid cell size constant.
Mechanism
auto_font=True (now the default) holds each grid cell at a constant physical size and grows the figure with the data, so subcategory labels, position ticks and residue letters stay at a constant, legible font regardless of grid size, with no manual plot_settings(font_scale=...). It only ever GROWS the figure (floored at the default), so sparse grids never collapse. auto_font=False reproduces the previous fixed-size output (byte-identical). Covers feature_map, heatmap, profile, and ranking.
Testing
… figsize=None in both states, seq_char_fill follows auto_font). Full plotting + cpp_plot suites green (666).
Closes nothing (issue #219 stays open for the broader font work); part of #219.
🤖 Generated with Claude Code