[Warning - work done by Claude/Opus 4.7] Bug fixes and core-loop / kernel / aggregation speedups#82
Open
mckellardw wants to merge 1 commit intodpeerlab:mainfrom
Open
[Warning - work done by Claude/Opus 4.7] Bug fixes and core-loop / kernel / aggregation speedups#82mckellardw wants to merge 1 commit intodpeerlab:mainfrom
mckellardw wants to merge 1 commit intodpeerlab:mainfrom
Conversation
Five correctness fixes: - cpu.py / cpu_dense.py / gpu.py: replace `raise RuntimeWarning` with `warnings.warn` so a non-converging fit doesn't abort and discard already-written labels. - evaluate.get_density: honor the `key` argument (was checking the literal string "key" and reading X_pca regardless). - core.sparsify_assignments / summarize_by_soft_SEACell: guard row / metacell divisions when all weights fall below minimum_weight, no more NaN/Inf in the metacell expression matrix. - build_graph.rbf_for_row: clamp zero entries of the adaptive-bandwidth denominator to eps to prevent NaN from duplicate cells. - genescores.prepare_multiome_anndata: cast SEACell labels to string consistently so integer labels no longer raise KeyError. Five speedups, each numerically equivalent up to FP-ordering and validated against the prior implementation: - cpu.py _updateA / _updateB: in-place rank-1 dense FW updates plus incremental tracking of t1 @ A / K @ B, ~17x / ~8x. - build_graph rbf_for_row: distance only to row-i's neighbors; rbf: vstack rows instead of lil_matrix assignment loop, ~6.6x. - core.summarize_by_SEACell / summarize_by_soft_SEACell: replace per- metacell Python loop with one sparse matmul (indicator @ data, or (A.T @ data) / totals with zero-guard), ~14x / ~46x. - genescores: hoist rankdata over atac/rna metacell matrices once in get_gene_peak_correlations and pass precomputed ranks into _peaks_correlations_per_gene; free the raw-expression DataFrames after rank precompute, ~8x+. Public function signatures, return types, and downstream contracts (meta_ad obs/var/layers, csr_matrix returns from _updateA/_updateB) are unchanged. cpu_dense.py and gpu.py FW loops are not touched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR delivers a batch of correctness fixes and major performance optimizations across the SEACells core optimization loop, kernel/graph construction, and metacell aggregation utilities, plus a small documentation addition for repo contributor guidance.
Changes:
- Prevent non-convergence from aborting fits by switching from
raise RuntimeWarningtowarnings.warn(CPU dense + GPU + sparse CPU). - Fix metric correctness issues (e.g.,
evaluate.get_densitynow uses the providedkey) and improve numerical robustness (e.g., RBF denominator zero-guard; soft aggregation zero-guards). - Speed up the main compute path via vectorized sparse matmuls in aggregation and incremental dense Frank–Wolfe updates in
cpu.py, plus faster RBF construction and rank precomputation ingenescores.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
SEACells/cpu.py |
Rewrites Frank–Wolfe inner loops for _updateA/_updateB to use in-place dense rank-1 updates and incremental tracking for large speedups; non-convergence now warns instead of raising. |
SEACells/cpu_dense.py |
Switch non-convergence handling from raising to warning to avoid aborting after labels have been written. |
SEACells/gpu.py |
Same non-convergence warning behavior change for GPU backend. |
SEACells/core.py |
Vectorizes hard/soft metacell aggregation via sparse matmul; adds division-by-zero guards to avoid NaN/Inf rows. |
SEACells/build_graph.py |
Speeds up RBF kernel construction by computing distances only to neighbors and stacking CSR rows; guards zero denominators to avoid NaNs from duplicates. |
SEACells/genescores.py |
Fixes integer SEACell label handling for metacell filtering; hoists rank computation out of the per-gene loop for multiome peak–gene correlations. |
SEACells/evaluate.py |
Fixes get_density to honor the caller-supplied key in ad.obsm. |
CLAUDE.md |
Adds repository contributor guidance (architecture, common commands, conventions). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+52
to
+55
| svd = pd.DataFrame(atac_mod_ad.obsm["X_svd"], index=atac_mod_ad.obs_names) | ||
| summ_svd = svd.groupby(atac_mod_ad.obs[SEACells_label]).mean() | ||
| atac_meta_ad.obsm["X_svd"] = summ_svd.loc[atac_meta_ad.obs_names, :].values | ||
|
|
Comment on lines
+519
to
+520
| # CSC view for fast column slicing K[:, amins] inside the loop. | ||
| K_csc = K.tocsc() if hasattr(K, "tocsc") else K |
|
|
||
| - Formatting is enforced via pre-commit (black, isort, ruff with `--fix`, prettier, blacken-docs). Run `pre-commit run --all-files` before opening a PR; the hooks autofix most issues. | ||
| - `python_requires=">=3.8"` per `setup.py`. The conda `environment.yaml` is stale (pins 3.5) and the README's option (4) is the working install path. | ||
| - GPU backend depends on CuPy and is imported lazily — do not add a top-level `import cupy` anywhere. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Five correctness fixes:
raise RuntimeWarningwithwarnings.warnso a non-converging fit doesn't abort and discard already-written labels.keyargument (was checking the literal string "key" and reading X_pca regardless).Five speedups, each numerically equivalent up to FP-ordering and validated against the prior implementation:
Public function signatures, return types, and downstream contracts (meta_ad obs/var/layers, csr_matrix returns from _updateA/_updateB) are unchanged. cpu_dense.py and gpu.py FW loops are not touched.