Pipeline for RNA-seq analysis


This is a simple utility to analyse RNA-seq data with a selection of popular tools. It is written in snakemake, which allows the user to scale it to any system, including clusters and cloud computing architectures. Currently, it supports analyses with the following tools:

  • prefetch and fasterq-dump (downloading runs and converting them to fastq)

  • fastqc and multiqc (quality control for reads)

  • sortmerna (rRNA filtering)

  • star (genome indexing and mapping)

  • salmon and kallisto (transcript indexing and mapping)

  • stringtie, Rsubread and tximport (quantification and count matrices)

  • DESeq2 (differential expression analysis)

  • trinity and bowtie2 (de-novo assembly and alignment)

In short, given a public run ID from the SRA archive (SRRXXXXXXX), the pipeline can download the raw data from the NCBI server, convert the files to fastq format, filter the reads, generate indices, map reads to the genome/transcriptome, and quantify the mapped reads. In addition, it can also carry out differential expression analysis.

Alternatively, the user can provide their own fastq or SRA files for the analysis. To modify the default options, refer to the appropriate rules in the respective rules/filename.smk files. Please read the wiki for more details.


Dependencies

  • python 3+ (tested on Python 3.7 and 3.8)

  • snakemake (workflow management)

  • pandas

  • R 4+ (for DESeq2 analysis; tested on R 4.0.2)

  • tximport, DESeq2, AnnotationDbi, Rsubread (for R-related analyses)

All the above-mentioned dependencies have been successfully tested on Ubuntu 18.04.5, Gentoo 5.4.48 and Fedora 32.
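As a sketch, assuming a working R 4+ installation, the R dependencies can be installed from Bioconductor like this:

install.packages("BiocManager")
BiocManager::install(c("tximport", "DESeq2", "AnnotationDbi", "Rsubread"))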

Installation

For installation, it is a good idea to make a new python virtual environment, install snakemake in it, and then download all the tool binaries to your desired location. A minimal environment setup might look like this (the environment name is illustrative):
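python3 -m venv cardio-env
source cardio-env/bin/activate
pip install snakemake pandas

Following this, retrieve the repository using git: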

git clone https://github.com/rohitsuratekar/CardioPipeLine.git

Alternatively, download the repository as an archive and extract it in the desired location.

The only other modification that needs to be made is to edit the config/config.yaml and config/samples.csv files to specify the location of files on your system, and then run the following in the repository folder:

snakemake --cores all

The above command will essentially carry out all the steps for you, from downloading the files to quantification, while you sit back and relax :)
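If you first want to see which steps snakemake would execute without actually running them, you can do a dry run:

snakemake --cores all -n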

Tasks

This pipeline carries out the tasks listed below. However, you can always select which tasks you want :)

Task No  Details                    Tool Used     Input #                                       Output #, *
0        All tasks                  --            --                                            --
1        Download .sra file         prefetch      samples.csv                                   srr.sra
2        Convert to fastq           fasterq-dump  srr.sra                                       srr.fastq
3        rRNA filtering             sortmerna     srr.fastq                                     srr.filtered.fastq
4        STAR indexing              star          genome.fa, anno.gtf                           star-idx
5        STAR mapping               star          star-idx, srr.filtered.fastq / srr.fastq      srr.bam
6        Quantification             stringtie     srr.bam, anno.gtf                             st.tsv
7        Salmon indexing            salmon        anno.gtf                                      salmon-idx
8        Salmon mapping             salmon        salmon-idx, srr.filtered.fastq / srr.fastq    quants.sf
9        Kallisto indexing          kallisto      anno.gtf                                      kallisto-idx
10       Kallisto mapping           kallisto      kallisto-idx, srr.filtered.fastq / srr.fastq  abundance.tsv
11       Count Matrix (StringTie)   prepDE.py     st.tsv                                        stringtie.counts
12       Count Matrix (STAR)        Rsubread      srr.bam                                       star.counts
13       Count Matrix (Salmon)      tximport      quants.sf                                     salmon.counts
14       Count Matrix (Kallisto)    tximport      abundance.tsv                                 kallisto.counts
15       Differential Analysis      DESeq2        *.counts                                      con1_vs_con2.csv
16       Quality Control Report     fastqc        --                                            srr.html
17       Quality Control Report     multiqc       --                                            quality_report.html
18       de-novo Assembly           trinity       srr.fastq                                     trinity.fasta
19       de-novo Alignment          bowtie2       trinity.fasta                                 stat.txt

* There might be many other related outputs. # Names are for representative purposes only; actual names will differ.

Selection of tasks

You have full control over which tasks to run. You can run the desired tasks by editing the tasks parameter in the config.yaml file. If you want to perform a specific task, you need to provide its input files in the appropriate location.

But do not worry: thanks to snakemake, if you did not provide the appropriate input files, it will search for the task which outputs these files and run it for you automatically.

By default, tasks: 0 is set. This will run all available tasks.
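For example, to run only rRNA filtering, STAR mapping and quantification (tasks 3, 5 and 6 from the table above), the entry could look like the sketch below; check the comments in config/config.yaml for the exact format expected:

tasks: [3, 5, 6]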

Setting up the workflow

Editing config.yaml and samples.csv is probably the most important aspect of this pipeline. These files will change for each user according to their system and work.

config.yaml

If you are fine with the default options of the tools used in this workflow, you can just edit this file and run the pipeline with snakemake. Just read the comments in the config/config.yaml file and you will know what to do :) The comments are self-explanatory.

The config file has a parameter called base which provides the base folder for all the analysis. Every output file created with this pipeline can be found in this base folder. The overall output file structure is shown below; a more detailed file structure can be found in the wiki.

base
├── sra
├── fastq
├── filtered
├── bams
├── index
├── mappings
├── deseq2
└── logs
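For example, the corresponding entry in config.yaml could look like this (the path itself is only illustrative):

base: "/home/user/rnaseq_analysis"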

samples.csv

This file contains the names of all the samples which you want to analyse with this pipeline. You should provide SRA Run IDs (usually in SRRXXXXXXX format). You can find these IDs for publicly available datasets on the GEO Datasets website. If you have your own fastq files, check the naming convention, rename those files accordingly, keep them in the appropriate folder, and start the pipeline.

This .csv (comma-delimited) file SHOULD have a header row and SHOULD contain at least the following columns:

  • run : Run ID

  • is_paired : true (if you are handling a paired-end sample), false (if it is a single-end sample)

Example file content:

run,is_paired
SRR0000001,true
SRR0000002,false

If you are performing any DESeq2 analysis, a third column specifying conditions is mandatory:

run,is_paired,condition
SRR0000001,true,wt
SRR0000002,false,mt

Here the third column, condition, will be used in the DESeq2 analysis. You should update the appropriate parameters in the config.yaml file. For example, let us assume the following sample file:

run,is_paired,time
SRR0000001,true,48
SRR0000002,false,24

In the above file, the third column is named time. If 24 is your reference condition, you should then update the following in config.yaml:

deseq2:
  design_column: "time"
  reference: "24"

In addition, you should also update the design parameter in the config.yaml file. For example, in the above situation, it could be design: "~ time".
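Putting it together, the DESeq2-related configuration could look like the sketch below (whether the design key sits inside the deseq2 block is an assumption; follow the comments in config/config.yaml):

deseq2:
  design_column: "time"
  reference: "24"
  design: "~ time"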

Code Snippets

script: "../scripts/bam_to_counts.R"

shell:
     """
     echo "{wildcards.SRR_ID} {wildcards.BASE_FOLDER}/mappings/stringtie/{wildcards.SRR_ID}/{wildcards.SRR_ID}_assembled.gtf" > input.temp

     mv input.temp {wildcards.BASE_FOLDER}/mappings/stringtie/{wildcards.SRR_ID}/{wildcards.SRR_ID}_input.temp

     {input.binary} -i {wildcards.BASE_FOLDER}/mappings/stringtie/{wildcards.SRR_ID}/{wildcards.SRR_ID}_input.temp -g {output.gmat} -t {output.tmat}

     rm -f {wildcards.BASE_FOLDER}/mappings/stringtie/{wildcards.SRR_ID}/{wildcards.SRR_ID}_input.temp
     """
script: "../scripts/txi_to_counts.R"

script: "../scripts/txi_to_counts.R"

run:
    generate_deseq_input(wildcards)

script: "../scripts/deseq2.R"

shell:
     """
     {input.fastqc} -t {threads} -o {wildcards.BASE_FOLDER}/quality/{wildcards.SRR_ID} {input.f1} {input.f2}
     """

shell:
     """
     {input.fastqc} -t {threads} -o {wildcards.BASE_FOLDER}/quality/{wildcards.SRR_ID} {input.f1}
     """

shell:
     """
     {input.fastqc} -t {threads} -o {wildcards.BASE_FOLDER}/quality/{wildcards.SRR_ID} {input.f1} {input.f2}
     """

shell:
     """
     {input.fastqc} -t {threads} -o {wildcards.BASE_FOLDER}/quality/{wildcards.SRR_ID} {input.f1}
     """
    shell:
         "{input.kallisto} index "  # Kallisto index mode
         "-i {output} "  # Filename (not folder) for kallisto
         "{input.transcript}"  # Transcript


def make_kallisto_reads(files):
    if len(files) == 2:
        # Paired end reads
        return " ".join(files)
    else:
        # Single end, use parameters from config file
        frg_length = config["extra"]["kallisto-fragment-length"]  # Fragment
        # length
        frg_sd = config["extra"]["kallisto-standard-deviation"]  # Standard
        # deviation
        return f"--single -l {frg_length} -s {frg_sd} {files}"
shell:
     """
     multiqc {wildcards.BASE} -f -n {wildcards.BASE}/quality_report.html 

     touch {wildcards.BASE}/quality.html
     """
    shell:
         "mkdir -p {wildcards.BASE_FOLDER}/sra && "  # Make folder if not present
         "{params.prefetch} -p {wildcards.SRR_ID} "  # Prefetch main command
         "--output-file {output} "  # Write to specific file

rule get_fastq_paired:
    input: ancient("{BASE_FOLDER}/sra/{SRR_ID}.sra")
    threads: config["threads"]
    output:
        "{BASE_FOLDER}/fastq/{SRR_ID}.sra_1.fastq",
        "{BASE_FOLDER}/fastq/{SRR_ID}.sra_2.fastq"
shell:
     "{params.fasterq} "  # Binary
     "{input} "  # Input SRA File
     "--split-files "  # Split files according to the reads
     "--progress "  # Show progress
     "--skip-technical "  # Skip technical reads
     "--threads {threads} "  # Number of threads
     "--outdir {wildcards.BASE_FOLDER}/fastq"
From line 37 of rules/ncbi.smk
shell:
     "{params.fasterq} "  # Binary
     "{input} "  # Input SRA File
     "--split-files "  # Split files according to the reads
     "--progress "  # Show progress
     "--skip-technical "  # Skip technical reads
     "--threads {threads} "  # Number of threads
     "--outdir {wildcards.BASE_FOLDER}/fastq"
From line 52 of rules/ncbi.smk
    shell:
         "scripts/generateDecoyTranscriptome.sh "
         "-j {threads} "  # Number of threads
         "-a {input.gtf} "  # Genome GTF File
         "-g {input.fasta} "  # Genome in fasta
         "-t {input.transcript} "  # Transcript file in fasta
         "-o {wildcards.BASE}/index/salmon "  # Salmon index output folder
         "-b {input.bedtools} "  # Path to bedtools
         "-m {input.mashmap} "  # Path to mashmap

# Generate index for salmon
rule index_salmon:
    input:
         decoy_index=ancient("{BASE}/index/salmon/gentrome.fa"),
         decoy_keys=ancient("{BASE}/index/salmon/decoys.txt"),
         salmon=config["tools"]["salmon"]
    threads: config["threads"]
    output: "{BASE}/index/salmon/pos.bin"
    shell:
         "{input.salmon} index "  # Salmon index mode
         "-t {input.decoy_index} "  # Decoy transcript
         "-i {wildcards.BASE}/index/salmon "  # Path to index folder
         "--decoys {input.decoy_keys} "  # Decoy keys
         "-k 31 "  # K-mer (31) is recommended by the authors
         "-p {threads}"  # Number of threads


def get_salmon_reads(files):
    if isinstance(files, str):
        # Single-end sample: one FASTQ file
        return f"-r {files}"
    # Paired-end sample: a list of two FASTQ files. This branch completes a
    # snippet truncated here; -1/-2 are salmon's flags for the two mate files.
    return f"-1 {files[0]} -2 {files[1]}"
shell:
     """
     rm -rf {params.index_dir}/kvdb

     {params.sortmerna} {params.ref_str} {params.read_str} -workdir {params.index_dir} -fastx -other -a {threads} -num_alignments 1 -paired_in --out2 

     mkdir -p {BASE}/filtered/ 

     mkdir -p {BASE}/logs

     mv {params.index_dir}/out/aligned.log {BASE}/logs/{wildcards.SRR_ID}_filtering.log

     mv {params.index_dir}/out/other_fwd.fastq {BASE}/filtered/{wildcards.SRR_ID}.sra.filtered_1.fastq
     mv {params.index_dir}/out/other_rev.fastq {BASE}/filtered/{wildcards.SRR_ID}.sra.filtered_2.fastq

     rm  -f {params.index_dir}/out/aligned_fwd.fastq
     rm  -f {params.index_dir}/out/aligned_rev.fastq
     """
shell:
     """
     rm -rf {params.index_dir}/kvdb

     {params.sortmerna} {params.ref_str}  -reads {input.reads} -workdir {params.index_dir} -fastx -other -a {threads} -num_alignments 1 -paired_in 

     mkdir -p {BASE}/filtered/ 

     mkdir -p {BASE}/logs

     mv {params.index_dir}/out/aligned.log {BASE}/logs/{wildcards.SRR_ID}_filtering.log

     mv {params.index_dir}/out/other.fastq {BASE}/filtered/{wildcards.SRR_ID}.sra.filtered.fastq 

     rm  -f {params.index_dir}/out/aligned.fastq
     """
shell:
     """
     {input.star} --runMode genomeGenerate --runThreadN {threads} --genomeDir {params.folder} --genomeFastaFiles {input.genome} --sjdbGTFfile {input.gtf}

     if [ -f Log.out ]; then
         mkdir -p {BASE}/logs
         mv Log.out {BASE}/logs/STAR.index.log
     fi
     """
From line 29 of rules/star.smk
shell:
     """
     {input.star} --runThreadN {threads} --genomeDir {BASE}/index/star --outFileNamePrefix {params.prefix} --outSAMtype BAM SortedByCoordinate --readFilesIn {input.files}

     rm -f {BASE}/bams/{params.prefix}.Log.progress.out

     """
From line 60 of rules/star.smk
    shell:
         """ 
         export PATH="$(dirname {input.samtools}):$PATH"
         export PATH="$(dirname {input.jellyfish}):$PATH"
         export PATH="$(dirname {input.bowtie2}):$PATH"
         export PATH="$(dirname {input.salmon}):$PATH"

         {input.trinity} --seqType fq --left {input.files[0]} --right {input.files[1]} --CPU {threads} --output {wildcards.BASE_FOLDER}/mappings/trinity/trinity_{wildcards.SRR_ID} --max_memory 50G --verbose
         """

"""
Build BowTie indexs

bowtie2-build assembled transcript output_prefix
"""
    shell:
         """
         # bowtie2-build <reference fasta> <index prefix>: the transcript
         # fasta path is reused as the index prefix
         {input.bowtie2} {input.transcripts} {input.transcripts}
         """

"""
Get statistics from paired-end
Bowtie2:
-p : number of threads
-q : input files are in fastq format
--no-unal: suppress SAM records for unaligned reads
-x: Index files prefix (without X.bt2 extension)

"""
shell:
     """
     {input.bowtie} -p {threads} -q --no-unal -x {input.transcripts} -1 {input.files[0]} -2 {input.files[1]} > {output}
     """

From scripts/bam_to_counts.R

suppressPackageStartupMessages(library("data.table"))
library("Rsubread")


fc <- Rsubread::featureCounts(
  files = snakemake@input$bam,
  annot.ext = snakemake@input$gtf,
  isGTFAnnotationFile = TRUE,
  isPairedEnd = as.logical(snakemake@params$is_paired),
  nthreads = snakemake@threads)

colnames(fc$counts) <- snakemake@wildcards$SRR_ID
df <- data.frame(fc$counts)
data.table::setDT(df, keep.rownames = "gene_id")

fname <- paste(snakemake@wildcards$BASE, "deseq2", "counts", snakemake@wildcards$SRR_ID, sep = "/")
fname <- paste0(fname, "/", snakemake@wildcards$SRR_ID, ".star.counts")
write.csv(df, file = fname, row.names = FALSE)
write.csv(fc$stat, file = paste0(fname, ".stat"), row.names = FALSE)

From scripts/deseq2.R

start_time <- Sys.time()

suppressPackageStartupMessages(library("DESeq2"))

# We can relevel our conditions initially so that we do not have to worry
# about them later on
conds <- factor(snakemake@params$conditions)
conds <- relevel(conds, snakemake@params$reference)

colData <- data.frame(sample = snakemake@params$samples,
                      condition = conds)

df <- read.csv(snakemake@input$combined)
rownames(df) <- df$gene_id
df$gene_id <- NULL

dds <- DESeq2::DESeqDataSetFromMatrix(countData = df,
                                      colData = colData,
                                      design = formula(snakemake@params$design))


# Start the DE analysis
dds <- DESeq2::DESeq(dds)

outputs <- NULL

# Get results from each combination of conditions
for (g in unique(snakemake@params$conditions)) {
  if (g != snakemake@params$reference) {
    contrast <- c(snakemake@params$column, g, snakemake@params$reference)
    res <- results(dds, contrast = contrast)
    fn <- paste(snakemake@wildcards$BASE, "deseq2", "analysis",
                snakemake@wildcards$METHOD, snakemake@wildcards$METHOD,
                sep = "/")
    name <- paste(fn, g, "vs", snakemake@params$reference, sep = "_")
    name <- paste0(name, ".csv")
    df <- as.data.frame(res)
    data.table::setDT(df, keep.rownames = "gene_id")
    write.csv(df, file = name, row.names = FALSE)
    outputs <- c(outputs, name)
  }
}

# Write a summary log of this DESeq2 run
summary <- paste(snakemake@wildcards$BASE, "deseq2", "analysis",
                 snakemake@wildcards$METHOD, "analysis.log",
                 sep = "/")


fileConn <- file(summary)
fc <- "Analysis \t: DESeq2"
fc <- c(fc, paste0("Start Time \t: ", start_time))
fc <- c(fc, paste0("Finish Time \t: ", Sys.time()))
fc <- c(fc, paste0("Samples \t: ", paste0(snakemake@params$samples, collapse
  = ",")))
fc <- c(fc, paste0("Conditions \t: ", paste0(snakemake@params$conditions,
                                             collapse = ",")))
fc <- c(fc, paste0("Design Formula \t: ", snakemake@params$design))
fc <- c(fc, paste0("Reference \t: ", snakemake@params$reference))
fc <- c(fc, paste0("Outputs \t: ", paste(outputs, sep = "")))
writeLines(fc, fileConn)
close(fileConn)

From scripts/generateDecoyTranscriptome.sh

abort()
{
    echo >&2 '
***************
*** ABORTED ***
***************
'
    echo "An error occurred. Exiting..." >&2
    exit 1
}

trap 'abort' 0

set -e

###############################################
## It assumes awk, bedtools and mashmap is
## available.
## We have tested this script with
## awk 4.1.3, bedtools v2.28.0 and mashmap v2.0
## on an Ubuntu system.
###############################################

threads=1
awk="awk"
bedtools="bedtools"
mashmap="mashmap"

# Argument Parsing
print_usage_and_exit () {
    echo "Usage: $0 [-j <N> =1 default] [-b <bedtools binary path> =bedtools default] [-m <mashmap binary path> =mashmap default] -a <gtf file> -g <genome fasta> -t <txome fasta> -o <output path>"
    exit 1
}

echo "****************"
echo "*** getDecoy ***"
echo "****************"
while getopts ":a:b:o:j:h:g:t:m:" opt; do
    case $opt in
        b)
            bedtools=`realpath $OPTARG`
            echo "-b <bedtools binary> = $bedtools"
            ;;
        m)
            mashmap=`realpath $OPTARG`
            echo "-m <mashmap binary> = $mashmap"
            ;;
        a)
            gtffile=`realpath $OPTARG`
            echo "-a <Annotation GTF file> = $gtffile"
            ;;
        o)
            outfolder="$OPTARG"
            echo "-o <Output files Path> = $outfolder"
            ;;
        j)
            threads="$OPTARG"
            echo "-j <Concurrency level> = $threads"
            ;;
        g)
            genomefile=`realpath $OPTARG`
            echo "-g <Genome fasta> = $genomefile"
            ;;
        t)
            txpfile=`realpath $OPTARG`
            echo "-t <Transcriptome fasta> = $txpfile"
            ;;
        h)
            print_usage_and_exit
            ;;
        \?)
            echo "Invalid option: -$OPTARG"
            print_usage_and_exit
            ;;
        :)
            echo "Option -$OPTARG requires an argument."
            print_usage_and_exit
            ;;
    esac
done

# Required arguments
if [ -z "$gtffile" -o -z "$outfolder" -o -z "$genomefile" -o -z "$txpfile" -o -z "$mashmap" -o -z "$awk" -o -z "$bedtools" -o -z "$threads" ]
then
    echo "Error: missing required argument(s)"
    print_usage_and_exit
fi

mkdir -p $outfolder
cd $outfolder

# extracting all the exonic features to mask
echo "[1/10] Extracting exonic features from the gtf"
$awk -v OFS='\t' '{if ($3=="exon") {print $1,$4,$5}}' $gtffile > exons.bed

# masking the exonic regions from the genome
echo "[2/10] Masking the genome fasta"
$bedtools maskfasta -fi $genomefile -bed exons.bed -fo reference.masked.genome.fa

# aligning the transcriptome to the masked genome
echo "[3/10] Aligning transcriptome to genome"
$mashmap -r reference.masked.genome.fa -q $txpfile -t $threads --pi 80 -s 500

# extracting the bed files from the reported alignment
echo "[4/10] Extracting intervals from mashmap alignments"
$awk -v OFS='\t' '{print $6,$8,$9}' mashmap.out | sort -k1,1 -k2,2n - > genome_found.sorted.bed

# merging the reported intervals
echo "[5/10] Merging the intervals"
$bedtools merge -i genome_found.sorted.bed > genome_found_merged.bed

# extracting relevant sequence from the genome
echo "[6/10] Extracting sequences from the genome"
$bedtools getfasta -fi reference.masked.genome.fa -bed genome_found_merged.bed -fo genome_found.fa

# concatenating the sequence at per chromsome level to extract decoy sequences
echo "[7/10] Concatenating to get decoy sequences"
$awk '{a=$0; getline;split(a, b, ":");  r[b[1]] = r[b[1]]""$0} END { for (k in r) { print k"\n"r[k] } }' genome_found.fa > decoy.fa

# concatenating decoys to transcriptome
echo "[8/10] Making gentrome"
cat $txpfile decoy.fa > gentrome.fa

# extracting the names of the decoys
echo "[9/10] Extracting decoy sequence ids"
grep ">" decoy.fa | $awk '{print substr($1,2); }' > decoys.txt

# removing extra files
echo "[10/10] Removing temporary files"
rm exons.bed reference.masked.genome.fa mashmap.out genome_found.sorted.bed genome_found_merged.bed genome_found.fa decoy.fa reference.masked.genome.fa.fai

trap : 0
echo >&2 '
**********************************************
*** DONE Processing ...
*** You can use files `$outfolder/gentrome.fa`
*** and `$outfolder/decoys.txt` with
*** `salmon index`
**********************************************
'

From scripts/txi_to_counts.R

library("tximport")
suppressPackageStartupMessages(library("AnnotationDbi"))

tx <- GenomicFeatures::makeTxDbFromGFF(file = snakemake@input$gtf,
                                       format = "gtf",
                                       organism = snakemake@params$animal)

k <- AnnotationDbi::keys(tx, keytype = "GENEID")
tx_df <- AnnotationDbi::select(tx,
                               keys = k,
                               keytype = "GENEID",
                               columns = "TXNAME")
tx2gene <- tx_df[, 2:1]

txi <- tximport::tximport(snakemake@input$file,
                          type = snakemake@params$method,
                          tx2gene = tx2gene,
                          ignoreTxVersion = TRUE)

colnames(txi$counts) <- snakemake@wildcards$SRR_ID
# Tximport will generate fractions. We need to convert them to integers by
# just rounding them. This is what the original class does
# https://github.com/mikelove/DESeq2/blob/master/R/AllClasses.R
txi$counts <- round(txi$counts)
df <- data.frame(txi$counts)
data.table::setDT(df, keep.rownames = "gene_id")

fname <- paste(snakemake@wildcards$BASE, "deseq2", "counts",
               snakemake@wildcards$SRR_ID, sep = "/")
fname <- paste0(fname, "/", snakemake@wildcards$SRR_ID, ".",
                snakemake@params$method, ".counts")
write.csv(df, file = fname, row.names = FALSE)
