Includes:
- pseudoreference preparation: pseudoreferences_creation/prepare_pseudoreference.py
- alignment: STAR
- allelic read resolving: fastq_to_allelic_counts_tabs/alleleseparation.py
- counting reads per gene: featureCounts
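As a rough illustration of steps (2) and (4), the sketch below builds command lines for STAR and featureCounts without running them. It uses only standard options of those tools; the function names, paths, and thread count are hypothetical, and the wrapper scripts in this repository may pass different or additional options.

```python
from pathlib import Path


def star_command(genome_dir, fastq, out_prefix, threads=4):
    """Build a STAR alignment command (single-end reads, sorted BAM output)."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", str(genome_dir),      # STAR index built on a pseudoreference (assumed)
        "--readFilesIn", str(fastq),
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", str(out_prefix),
    ]


def featurecounts_command(annotation_gtf, bam, out_table, threads=4):
    """Build a featureCounts command that counts reads per gene for one BAM."""
    return [
        "featureCounts",
        "-T", str(threads),
        "-a", str(annotation_gtf),           # gene annotation in GTF format
        "-o", str(out_table),                # output count table
        str(bam),
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    align = star_command(Path("index/ref1"), Path("sample.fastq"), "out/sample_")
    count = featurecounts_command(Path("genes.gtf"), Path("out/sample_allele1.bam"),
                                  Path("out/sample_counts.txt"))
    print(" ".join(align))
    print(" ".join(count))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once STAR and featureCounts are installed and the paths point to real files.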
See also: controlFreq, an R package for calculating overdispersion in RNA-seq samples in the presence of technical replicates or spike-ins.
For an example wrapper function for steps (1-3), see fastq2allelicbams.sh; for stats collection (such as the number of raw reads, the number of aligned reads, and the proportion of spike-in reads), see fastq2allelicbams_stats.sh; for step (4), see allelicbams2genecounts.sh. See the example directory for a sample batch table, and the Wiki page for more details, use cases, the motivation behind the pipeline choices, and QC.
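The kind of summary that the stats collection produces can be sketched as follows. This is not the logic of fastq2allelicbams_stats.sh; the function name is hypothetical, and taking the spike-in proportion relative to aligned reads (rather than raw reads) is an assumption made here for illustration.

```python
def alignment_stats(raw_reads, aligned_reads, spikein_reads):
    """Summarise read counts for one sample as fractions.

    Assumption: the spike-in proportion is computed over aligned reads.
    """
    if raw_reads <= 0:
        raise ValueError("raw_reads must be positive")
    if not 0 <= spikein_reads <= aligned_reads <= raw_reads:
        raise ValueError("expected 0 <= spikein_reads <= aligned_reads <= raw_reads")
    aligned_fraction = aligned_reads / raw_reads
    # Avoid division by zero when nothing aligned.
    spikein_fraction = spikein_reads / aligned_reads if aligned_reads else 0.0
    return {
        "raw_reads": raw_reads,
        "aligned_reads": aligned_reads,
        "aligned_fraction": aligned_fraction,
        "spikein_fraction": spikein_fraction,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(alignment_stats(raw_reads=1000, aligned_reads=800, spikein_reads=40))
```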
Note: step (1) is the same as in ASEReadCounter* (see Wiki), whereas steps (2-4) have since been modified.
The scheme was made in BioRender.