[Warning - work done by Claude/Opus 4.7] Bug fixes and core-loop / kernel / aggregation speedups by mckellardw · Pull Request #82 · dpeerlab/SEACells

mckellardw · 2026-04-29T14:30:05Z

Five correctness fixes:

cpu.py / cpu_dense.py / gpu.py: replace raise RuntimeWarning with warnings.warn so a non-converging fit doesn't abort and discard already-written labels.
evaluate.get_density: honor the key argument (was checking the literal string "key" and reading X_pca regardless).
core.sparsify_assignments / summarize_by_soft_SEACell: guard row / metacell divisions when all weights fall below minimum_weight, no more NaN/Inf in the metacell expression matrix.
build_graph.rbf_for_row: clamp zero entries of the adaptive-bandwidth denominator to eps to prevent NaN from duplicate cells.
genescores.prepare_multiome_anndata: cast SEACell labels to string consistently so integer labels no longer raise KeyError.

Five speedups, each numerically equivalent up to FP-ordering and validated against the prior implementation:

cpu.py _updateA / _updateB: in-place rank-1 dense FW updates plus incremental tracking of t1 @ A / K @ B, ~17x / ~8x.
build_graph rbf_for_row: distance only to row-i's neighbors; rbf: vstack rows instead of lil_matrix assignment loop, ~6.6x.
core.summarize_by_SEACell / summarize_by_soft_SEACell: replace per- metacell Python loop with one sparse matmul (indicator @ data, or (A.T @ data) / totals with zero-guard), ~14x / ~46x.
genescores: hoist rankdata over atac/rna metacell matrices once in get_gene_peak_correlations and pass precomputed ranks into _peaks_correlations_per_gene; free the raw-expression DataFrames after rank precompute, ~8x+.

Public function signatures, return types, and downstream contracts (meta_ad obs/var/layers, csr_matrix returns from _updateA/_updateB) are unchanged. cpu_dense.py and gpu.py FW loops are not touched.

Five correctness fixes: - cpu.py / cpu_dense.py / gpu.py: replace `raise RuntimeWarning` with `warnings.warn` so a non-converging fit doesn't abort and discard already-written labels. - evaluate.get_density: honor the `key` argument (was checking the literal string "key" and reading X_pca regardless). - core.sparsify_assignments / summarize_by_soft_SEACell: guard row / metacell divisions when all weights fall below minimum_weight, no more NaN/Inf in the metacell expression matrix. - build_graph.rbf_for_row: clamp zero entries of the adaptive-bandwidth denominator to eps to prevent NaN from duplicate cells. - genescores.prepare_multiome_anndata: cast SEACell labels to string consistently so integer labels no longer raise KeyError. Five speedups, each numerically equivalent up to FP-ordering and validated against the prior implementation: - cpu.py _updateA / _updateB: in-place rank-1 dense FW updates plus incremental tracking of t1 @ A / K @ B, ~17x / ~8x. - build_graph rbf_for_row: distance only to row-i's neighbors; rbf: vstack rows instead of lil_matrix assignment loop, ~6.6x. - core.summarize_by_SEACell / summarize_by_soft_SEACell: replace per- metacell Python loop with one sparse matmul (indicator @ data, or (A.T @ data) / totals with zero-guard), ~14x / ~46x. - genescores: hoist rankdata over atac/rna metacell matrices once in get_gene_peak_correlations and pass precomputed ranks into _peaks_correlations_per_gene; free the raw-expression DataFrames after rank precompute, ~8x+. Public function signatures, return types, and downstream contracts (meta_ad obs/var/layers, csr_matrix returns from _updateA/_updateB) are unchanged. cpu_dense.py and gpu.py FW loops are not touched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR delivers a batch of correctness fixes and major performance optimizations across the SEACells core optimization loop, kernel/graph construction, and metacell aggregation utilities, plus a small documentation addition for repo contributor guidance.

Changes:

Prevent non-convergence from aborting fits by switching from raise RuntimeWarning to warnings.warn (CPU dense + GPU + sparse CPU).
Fix metric correctness issues (e.g., evaluate.get_density now uses the provided key) and improve numerical robustness (e.g., RBF denominator zero-guard; soft aggregation zero-guards).
Speed up the main compute path via vectorized sparse matmuls in aggregation and incremental dense Frank–Wolfe updates in cpu.py, plus faster RBF construction and rank precomputation in genescores.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`SEACells/cpu.py`	Rewrites Frank–Wolfe inner loops for `_updateA/_updateB` to use in-place dense rank-1 updates and incremental tracking for large speedups; non-convergence now warns instead of raising.
`SEACells/cpu_dense.py`	Switch non-convergence handling from raising to warning to avoid aborting after labels have been written.
`SEACells/gpu.py`	Same non-convergence warning behavior change for GPU backend.
`SEACells/core.py`	Vectorizes hard/soft metacell aggregation via sparse matmul; adds division-by-zero guards to avoid NaN/Inf rows.
`SEACells/build_graph.py`	Speeds up RBF kernel construction by computing distances only to neighbors and stacking CSR rows; guards zero denominators to avoid NaNs from duplicates.
`SEACells/genescores.py`	Fixes integer SEACell label handling for metacell filtering; hoists rank computation out of the per-gene loop for multiome peak–gene correlations.
`SEACells/evaluate.py`	Fixes `get_density` to honor the caller-supplied `key` in `ad.obsm`.
`CLAUDE.md`	Adds repository contributor guidance (architecture, common commands, conventions).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+    svd = pd.DataFrame(atac_mod_ad.obsm["X_svd"], index=atac_mod_ad.obs_names)
+    summ_svd = svd.groupby(atac_mod_ad.obs[SEACells_label]).mean()
+    atac_meta_ad.obsm["X_svd"] = summ_svd.loc[atac_meta_ad.obs_names, :].values
+


+        # CSC view for fast column slicing K[:, amins] inside the loop.
+        K_csc = K.tocsc() if hasattr(K, "tocsc") else K


+
+- Formatting is enforced via pre-commit (black, isort, ruff with `--fix`, prettier, blacken-docs). Run `pre-commit run --all-files` before opening a PR; the hooks autofix most issues.
+- `python_requires=">=3.8"` per `setup.py`. The conda `environment.yaml` is stale (pins 3.5) and the README's option (4) is the working install path.
+- GPU backend depends on CuPy and is imported lazily — do not add a top-level `import cupy` anywhere.


Copilot AI review requested due to automatic review settings April 29, 2026 14:30

Copilot started reviewing on behalf of mckellardw April 29, 2026 14:30 View session

Copilot AI reviewed Apr 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Warning - work done by Claude/Opus 4.7] Bug fixes and core-loop / kernel / aggregation speedups#82

[Warning - work done by Claude/Opus 4.7] Bug fixes and core-loop / kernel / aggregation speedups#82
mckellardw wants to merge 1 commit intodpeerlab:mainfrom
mckellardw:main

mckellardw commented Apr 29, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# CSC view for fast column slicing K[:, amins] inside the loop.
		K_csc = K.tocsc() if hasattr(K, "tocsc") else K

Conversation

mckellardw commented Apr 29, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants