gimelbrantlab/fastq2allelictabs

Preprocessing protocol for allele-specific expression analysis of RNA-seq data

Includes:

  1. pseudoreference preparation: pseudoreferences_creation/prepare_pseudoreference.py
  2. alignment: STAR
  3. allelic read resolution: fastq_to_allelic_counts_tabs/alleleseparation.py
  4. counting reads per gene: featureCounts
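
As a rough illustration of how steps (2) and (4) might be invoked, here is a minimal Python sketch that only builds the command lines for STAR and featureCounts. It is not the repository's wrapper: the function names, paths, and thread counts are hypothetical, and only widely documented options of the two tools are used. Steps (1) and (3) are left out because the arguments of the repository scripts are not described here.

```python
import shlex
from typing import List


def star_command(genome_dir: str, fastq: str, out_prefix: str,
                 threads: int = 4) -> List[str]:
    """Build a STAR alignment command (step 2).

    genome_dir is assumed to hold a STAR index built from a pseudoreference.
    """
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq,
        "--outFileNamePrefix", out_prefix,
    ]
    # STAR needs a decompression command for gzipped input.
    if fastq.endswith(".gz"):
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


def featurecounts_command(gtf: str, bam: str, out_table: str,
                          threads: int = 4) -> List[str]:
    """Build a featureCounts command (step 4) for one allele-resolved BAM."""
    return ["featureCounts", "-T", str(threads),
            "-a", gtf, "-o", out_table, bam]


if __name__ == "__main__":
    # All paths below are placeholders, not files shipped with the repository.
    print(shlex.join(star_command("pseudoref/STAR_index",
                                  "sample_R1.fastq.gz", "out/sample_")))
    print(shlex.join(featurecounts_command("genes.gtf",
                                           "out/sample_allele1.bam",
                                           "out/sample_allele1_counts.txt")))
```

The resulting lists can be passed to `subprocess.run(cmd, check=True)`. For the actual workflow, including steps (1) and (3), refer to the wrapper scripts in the repository.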

See also: controlFreq, an R package for calculating overdispersion in RNA-seq samples in the presence of technical replication or spike-ins.

For an example wrapper script for steps (1-3), see fastq2allelicbams.sh; for stats collection (such as the number of raw reads, the number of aligned reads, and the proportion of spike-in reads), see fastq2allelicbams_stats.sh; for step (4), see allelicbams2genecounts.sh. See the example directory for a sample batch table, and the Wiki page for more details, use cases, the motivation behind the pipeline choices, and QC.

[Figure: scheme of the pipeline]

Note: step (1) is the same as in ASEReadCounter* (see Wiki), whereas steps (2-4) have since evolved.

The scheme was made in BioRender.
