GitHub - RohatgiLab/Base_editing_analysis_tool: The base editing tool provides a set of modular functions, including both individual components and a wrapper pipeline, for analyzing base-editing screen data. It includes functionality for aligning FASTQ files to a reference to generate read counts, as well as methods for downstream analysis such as z-score calculation.

Base-editing screen analysis pipeline

A modular R-based pipeline for processing CRISPR gRNA base-editing screening data from raw FASTQ files through alignment, read counting, and statistical analysis (z-scores, p-values, and FDR), using control-guide log fold-changes as the null distribution.

This repository is designed to support reproducible, end-to-end analysis of pooled CRISPR base-editing screens, enabling standardized processing from sequencing reads to statistically inferred guide-level effects.

An accompanying R Markdown file, Analysis_Example.Rmd, demonstrates a full example workflow, so users can run the pipeline step by step on example data and reproduce the analysis from raw sequencing reads to final statistical outputs.

Overview

This pipeline performs:

Data Processing: fastq_to_readcounts_pipeline.R

FASTQ preprocessing (trimming, motif-based filtering)
gRNA reference index construction
Alignment to gRNA library
BAM sorting and indexing
Read counting per gRNA
Merging sample-level count matrices
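
The final step above, merging sample-level count tables into one matrix, can be sketched in base R as follows. This is a minimal illustration, not the code in fastq_to_readcounts_pipeline.R: the function name `merge_counts`, the column names `gRNA` and `count`, and the toy data are all assumptions made for the example. It also assumes that a guide missing from a sample should be recorded with a count of 0.

```r
# Hypothetical per-sample count tables: one row per gRNA, one count column.
sample_a <- data.frame(gRNA = c("g1", "g2", "g3"), count = c(10L, 0L, 5L))
sample_b <- data.frame(gRNA = c("g2", "g3", "g4"), count = c(7L, 3L, 1L))

# merge_counts() is an illustrative helper, not a function from this repository.
merge_counts <- function(count_list) {
  # Rename each count column after its sample so columns stay distinguishable.
  renamed <- Map(function(df, nm) {
    names(df)[names(df) == "count"] <- nm
    df
  }, count_list, names(count_list))

  # Full outer join on the gRNA identifier across all samples.
  merged <- Reduce(function(x, y) merge(x, y, by = "gRNA", all = TRUE), renamed)

  # Assumption: a guide absent from a sample had zero reads in that sample.
  merged[is.na(merged)] <- 0L
  merged[order(merged$gRNA), , drop = FALSE]
}

counts <- merge_counts(list(sample_a = sample_a, sample_b = sample_b))
print(counts)
```

Using a full outer join keeps every guide in the library visible in the merged matrix, so dropouts show up as zeros rather than as missing rows.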

Statistical Analysis: readcounts_to_pvalues.R

RPM normalization
Replicate averaging
Log2 transformation
Log fold-change calculation
Outlier filtering (IQR-based)
Z-score normalization (control-based)
p-value estimation
Benjamini–Hochberg FDR correction
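
The statistical steps listed above can be sketched end to end in base R. The block below is a hedged illustration under stated assumptions, not the implementation in readcounts_to_pvalues.R. The function name `score_guides`, the column names, the pseudocount of 1, the 1.5 x IQR fences applied to control-guide log fold-changes, and the two-sided normal p-value are all choices made for the example. The data are simulated.

```r
set.seed(1)

# Simulated count matrix (hypothetical): 200 guides, the first 100 are controls,
# with two replicates before selection (pre_*) and two after (post_*).
n <- 200
counts <- data.frame(
  gRNA       = sprintf("g%03d", seq_len(n)),
  is_control = rep(c(TRUE, FALSE), each = n / 2),
  pre_1  = rpois(n, 500), pre_2  = rpois(n, 500),
  post_1 = rpois(n, 500), post_2 = rpois(n, 500)
)
# Make one targeting guide strongly depleted so there is something to detect.
counts$post_1[101] <- 20
counts$post_2[101] <- 25

# score_guides() is an illustrative helper, not a function from this repository.
score_guides <- function(counts, pre_cols, post_cols, pseudocount = 1) {
  samples <- c(pre_cols, post_cols)
  raw <- as.matrix(counts[, samples])

  # RPM normalization: scale each sample to one million total reads.
  rpm <- sweep(raw, 2, colSums(raw), "/") * 1e6

  # Replicate averaging, then log2 with a pseudocount to avoid log2(0).
  log_pre  <- log2(rowMeans(rpm[, pre_cols,  drop = FALSE]) + pseudocount)
  log_post <- log2(rowMeans(rpm[, post_cols, drop = FALSE]) + pseudocount)
  lfc <- log_post - log_pre

  # IQR-based outlier filtering of the control-guide LFCs (assumed 1.5 x IQR).
  ctrl  <- lfc[counts$is_control]
  q     <- unname(quantile(ctrl, c(0.25, 0.75)))
  fence <- 1.5 * (q[2] - q[1])
  ctrl  <- ctrl[ctrl >= q[1] - fence & ctrl <= q[2] + fence]

  # Z-score against the control null, two-sided normal p-value, BH correction.
  z <- (lfc - mean(ctrl)) / sd(ctrl)
  p <- 2 * pnorm(-abs(z))
  data.frame(gRNA = counts$gRNA, lfc = lfc, z = z,
             p_value = p, fdr = p.adjust(p, method = "BH"))
}

res <- score_guides(counts, c("pre_1", "pre_2"), c("post_1", "post_2"))
print(head(res[order(res$fdr), ]))
```

Estimating the mean and standard deviation from filtered control guides only means a few extreme controls do not inflate the null spread, which would otherwise shrink every z-score.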

Installation

Install required R packages:

install.packages(c(
  "tidyverse",
  "readxl",
  "openxlsx",
  "stringr",
  "data.table"
))

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install(c(
  "Rsubread",
  "Biostrings",
  "GenomicAlignments",
  "GenomicFeatures",
  "QuasR",
  "Rsamtools",
  "ShortRead"
))
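
After installation, a quick check that every package can be loaded may save debugging time later. The helper below is a small sketch; `missing_packages` is a hypothetical name, not part of this repository, and the package list simply repeats the names given above.

```r
# missing_packages() is an illustrative helper, not a function from this repository.
# It returns the names of packages whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!ok])
}

required <- c(
  "tidyverse", "readxl", "openxlsx", "stringr", "data.table",
  "Rsubread", "Biostrings", "GenomicAlignments", "GenomicFeatures",
  "QuasR", "Rsamtools", "ShortRead"
)

not_found <- missing_packages(required)
if (length(not_found) > 0) {
  message("Packages still missing: ", paste(not_found, collapse = ", "))
} else {
  message("All required packages are available.")
}
```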

Folders and files

Amino_acid_predictions
Demo_Fastq_files
FASTA_reference_file
Analysis_Example.Rmd
Analysis_Example.html
README.md
fastq_to_readcounts_pipeline.R
readcounts_to_pvalues.R
