Compound Pearson residuals for single-cell RNA-seq data without UMIs

This repository contains the code needed to reproduce the analyses and figures in our preprint Lause et al. (2023), including additional analyses requested during peer review for a journal submission. Code for earlier versions of the preprint is available under the earlier releases (v1, v2): code release v1 corresponds to v1 of the preprint, and code release v2 to the current version of the preprint. Release v3.0 contains the most recent revision analyses.

Code

Some of the notebooks and R scripts depend on each other and are best run in the order indicated. For the R scripts, use our separate R environment (see below for setup instructions). Notebooks/Scripts 1-18 use the Tasic 2018 dataset. Notebooks 19-24 are based on the reads-per-UMI tables from the Ziegenhain/Hagemann-Jensen datasets.
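As a rough aid for following the run order, the sketch below lists the numbered notebooks and R scripts sorted by their filename prefix and marks which environment each one needs. It assumes the order is given by the numeric prefix of each filename (as in 01_prepare_tasic.ipynb); the helper names ordered_steps and environment_for are hypothetical and not part of this repository.

```python
import re
from pathlib import Path

# Assumption: the run order is encoded in the numeric filename prefix,
# e.g. "01_prepare_tasic.ipynb" or "06_compute_qumis_census_sim1.R".
PREFIX = re.compile(r"^(\d+)_")


def ordered_steps(repo_dir="."):
    """Return numbered notebooks and R scripts sorted by their numeric prefix."""
    steps = []
    for path in Path(repo_dir).iterdir():
        match = PREFIX.match(path.name)
        if match and path.suffix in {".ipynb", ".R"}:
            steps.append((int(match.group(1)), path.name))
    return [name for _, name in sorted(steps)]


def environment_for(name):
    """R scripts run in the separate R environment, notebooks in Python."""
    return "R" if name.endswith(".R") else "Python"
```

Calling ordered_steps() from the repository root and printing environment_for(name) next to each name gives a simple checklist of what to run and where.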

Datasets

Download the reads-per-UMI tables from Zenodo and save them to .data/reads_per_umi_tables/. R code to obtain the same tables from the public raw data is available in data/reads_per_umi_tables/prepare_data.R.
Download the Tasic raw count data from brain-map.org via the link "Gene-level (exonic and intronic) read count values for all samples (zip)". From the downloaded zip files, extract mouse_ALM_2018-06-14_exon-matrix.csv and mouse_VISp_2018-06-14_exon-matrix.csv to .data/tasic/.
All required metadata tables are contained in this repository for convenience.
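The extraction step for the Tasic data can be sketched in Python as follows. This is a minimal illustration, not part of the repository: the function name extract_exon_matrices is hypothetical, the names of the downloaded zip files are not stated here and must be passed in by the caller, and the sketch assumes the CSV files may sit inside a subfolder of each archive.

```python
import shutil
import zipfile
from pathlib import Path

# The two exon matrices named in the instructions above.
WANTED = {
    "mouse_ALM_2018-06-14_exon-matrix.csv",
    "mouse_VISp_2018-06-14_exon-matrix.csv",
}


def extract_exon_matrices(zip_paths, target_dir=".data/tasic"):
    """Copy the wanted exon-matrix CSVs out of the given zip files.

    zip_paths: paths to the downloaded archives (file names are up to the caller).
    Returns the sorted names of the files that were extracted.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                name = Path(member).name
                if name in WANTED:
                    # Stream the file to disk and drop any folder
                    # structure used inside the archive.
                    with archive.open(member) as src, open(target / name, "wb") as dst:
                        shutil.copyfileobj(src, dst)
                    extracted.append(name)
    return sorted(extracted)
```

If the returned list does not contain both file names, one of the archives is probably missing or is a different download than expected.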

Compute environment

We ran all notebooks in Python 3.8.10 on an Ubuntu machine with 40 CPUs and 440 GB RAM. The following package versions were used:

scanpy 1.9.0
anndata 0.8.0
sklearn 1.0.2
numpy 1.21.5
matplotlib 3.5.1
openTSNE 0.6.0
pandas 1.4.1
seaborn 0.11.2
mygene 3.2.2
scipy 1.8.0
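To check a local Python environment against the versions listed above, something like the following sketch may help. It relies on importlib.metadata from the standard library (available in Python 3.8); the function names compare_versions and mismatches are hypothetical, and the sketch assumes that "sklearn" in the list refers to the package distributed as scikit-learn.

```python
from importlib import metadata

# Versions from the list above. Assumption: "sklearn" is installed
# under its distribution name "scikit-learn".
EXPECTED = {
    "scanpy": "1.9.0",
    "anndata": "0.8.0",
    "scikit-learn": "1.0.2",
    "numpy": "1.21.5",
    "matplotlib": "3.5.1",
    "openTSNE": "0.6.0",
    "pandas": "1.4.1",
    "seaborn": "0.11.2",
    "mygene": "3.2.2",
    "scipy": "1.8.0",
}


def compare_versions(expected=EXPECTED, lookup=metadata.version):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


def mismatches(report):
    """Keep only the packages whose installed version differs from the expected one."""
    return {name: pair for name, pair in report.items() if pair[0] != pair[1]}
```

Running mismatches(compare_versions()) returns an empty dictionary when the environment matches the list. A differing version does not necessarily break the notebooks, but it is a reasonable first thing to check if results cannot be reproduced.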

Census and qUMI were run in a separate R conda environment specified in r41_env.yml. To install it, create the environment from that file with

conda env create -f r41_env.yml

Then, to install qUMI, activate the environment with conda activate r41_env_full, start R and run

remotes::install_github("willtownes/quminorm")

