ZARP: An automated workflow for processing of RNA-seq data

ZARP (Zavolan-Lab Automated RNA-Seq Pipeline)

ZARP is a generic RNA-Seq analysis workflow that lets users process and analyze Illumina short-read sequencing libraries with minimal effort. It relies on publicly available bioinformatics tools and currently handles single-end or paired-end stranded bulk RNA-Seq data. The workflow is written in Snakemake, a workflow management system widely used in the bioinformatics community.

In the current ZARP implementation, reads are pre-processed, aligned and quantified with state-of-the-art tools to give meaningful initial insights into the quality and composition of an RNA-Seq library. This reduces hands-on time for bioinformaticians and lets experimentalists assess their data rapidly. Additional reports summarise the results of the individual steps and provide useful visualisations.

Requirements

The workflow has been tested on:

CentOS 7.5
Debian 10
Ubuntu 16.04, 18.04

NOTE: Currently, we only support Linux execution.

Code Snippets

shell:
    "(cat {input.reads} > {output.reads}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 210 of workflow/Snakefile
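
The rule above merges all read files of a sample into a single file with `cat`. This also works for gzip-compressed FASTQ files, because a sequence of gzip members is itself a valid gzip stream. A minimal Python sketch of the same byte-wise concatenation follows; the helper name `concatenate_reads` is hypothetical and not part of ZARP.

```python
import shutil


def concatenate_reads(inputs, output):
    """Byte-wise concatenation, equivalent to `cat inputs... > output`.

    Hypothetical helper for illustration only. The files are copied as raw
    bytes, so gzip-compressed inputs yield a valid multi-member gzip file.
    """
    with open(output, "wb") as out_handle:
        for path in inputs:
            with open(path, "rb") as in_handle:
                shutil.copyfileobj(in_handle, out_handle)
```

Reading the result back with `gzip.open` returns the records of all input files in the order in which they were passed.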

shell:
    "(mkdir -p {output.outdir}; \
    fastqc --outdir {output.outdir} \
    --threads {threads} \
    {params.additional_params} \
    {input.reads}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake FastQC From line 261 of workflow/Snakefile
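
Most rules in this section share one pattern: the command runs inside parentheses (a subshell), and its standard output and standard error are redirected to two separate log files. The sketch below shows the same pattern with Python's `subprocess` module. The helper name `run_logged` is an assumption made for illustration and does not appear in the workflow.

```python
import subprocess


def run_logged(command, stdout_log, stderr_log):
    """Run `command` (a list of arguments) and write its stdout and stderr
    to two separate files, like `(cmd) 1> stdout_log 2> stderr_log`.

    Hypothetical helper; returns the exit code of the command.
    """
    with open(stdout_log, "w") as out, open(stderr_log, "w") as err:
        completed = subprocess.run(command, stdout=out, stderr=err, check=False)
    return completed.returncode
```

Keeping the two streams apart makes it easier to tell regular tool output from warnings and errors when a rule fails.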

shell:
    "(mkdir -p {output.outdir}; \
    fastqc --outdir {output.outdir} \
    --threads {threads} \
    {params.additional_params} \
    {input.reads}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake FastQC From line 317 of workflow/Snakefile

shell:
    "(mkdir -p {params.output_dir}; \
    chmod -R 777 {params.output_dir}; \
    STAR \
    --runMode genomeGenerate \
    --sjdbOverhang {params.sjdbOverhang} \
    --genomeDir {params.output_dir} \
    --genomeFastaFiles {input.genome} \
    --runThreadN {threads} \
    --outFileNamePrefix {params.outFileNamePrefix} \
    --sjdbGTFfile {input.gtf} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake STAR oschmod From line 388 of workflow/Snakefile

shell:
    "(sort \
    -k1,1 -k4,4n -k5,5nr {input.gtf} > {output.gtf} \
    ) 1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 434 of workflow/Snakefile
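
The `sort` call above orders the annotation by sequence name (column 1), then by start coordinate ascending (column 4), then by end coordinate descending (column 5), so that a longer feature precedes shorter ones starting at the same position. A minimal Python sketch of that ordering follows. The function name `sort_gtf_lines` is hypothetical, and the sketch differs from `sort` in three labeled ways: comment lines are kept at the top, ties keep their input order, and sequence names are compared by code point rather than by locale.

```python
def sort_gtf_lines(lines):
    """Order GTF records by seqname, start (ascending), end (descending).

    Hypothetical helper for illustration. Assumes tab-separated GTF records;
    lines starting with '#' are kept first, blank lines are dropped.
    """
    comments = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if line.strip() and not line.startswith("#")]

    def key(line):
        fields = line.split("\t")
        # Negating the end coordinate gives a descending order on column 5.
        return (fields[0], int(fields[3]), -int(fields[4]))

    return comments + sorted(records, key=key)
```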

shell:
    "(gffread \
    -w {output.transcriptome} \
    -g {input.genome} \
    {params.additional_params} \
    {input.gtf}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 481 of workflow/Snakefile

shell:
    "(cat {input.transcriptome} {input.genome} \
    1> {output.genome_transcriptome}) \
    2> {log.stderr}"

SnakeMake From line 523 of workflow/Snakefile

shell:
    "(salmon index \
    --transcripts {input.genome_transcriptome} \
    --decoys {input.chr_names} \
    --index {output.index} \
    --kmerLen {params.kmerLen} \
    --threads {threads} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 583 of workflow/Snakefile
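
The `salmon index` call above receives the concatenated transcriptome and genome as `--transcripts` and a list of sequence names as `--decoys`. How ZARP generates `chr_names` is not shown in this section. The sketch below is one plausible way to derive such a list from the genome FASTA headers, assuming that the decoy file holds one genome sequence name per line. The function name `decoy_names` is hypothetical.

```python
def decoy_names(fasta_lines):
    """Collect sequence names from FASTA header lines.

    Hypothetical helper: the name is taken as the first whitespace-delimited
    token after '>', an assumption about how the sequences are named.
    """
    names = []
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                names.append(header.split()[0])
    return names
```

Writing the returned names to a text file, one per line, would give a candidate input for `--decoys`.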

shell:
    "(mkdir -p {params.output_dir}; \
    chmod -R 777 {params.output_dir}; \
    kallisto index \
    {params.additional_params} \
    -i {output.index} \
    {input.transcriptome}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake kallisto oschmod From line 626 of workflow/Snakefile

shell:
    "(gtf2bed12 \
    --gtf {input.gtf} \
    --bed12 {output.bed12} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 673 of workflow/Snakefile

shell:
    "(samtools sort \
    -o {output.bam} \
    -@ {threads} \
    {params.additional_params} \
    {input.bam}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 729 of workflow/Snakefile

shell:
    "(samtools index \
    {params.additional_params} \
    {input.bam} {output.bai};) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 786 of workflow/Snakefile

shell:
    "(calculate-tin.py \
    -i {input.bam} \
    -r {input.transcripts_bed12} \
    --names {params.sample} \
    -p {threads} \
    {params.additional_params} \
    > {output.TIN_score};) 2> {log.stderr}"

SnakeMake From line 859 of workflow/Snakefile

shell:
    "(salmon quantmerge \
    --quants {params.salmon_in} \
    --genes \
    --names {params.sample_name_list} \
    --column {params.salmon_merge_on} \
    --output {output.salmon_out} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 945 of workflow/Snakefile

shell:
    "(salmon quantmerge \
    --quants {params.salmon_in} \
    --names {params.sample_name_list} \
    --column {params.salmon_merge_on} \
    --output {output.salmon_out} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1032 of workflow/Snakefile
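
Conceptually, `salmon quantmerge` takes one column (selected with `--column`) from each sample's quantification table and combines the values into a single table with one column per sample. The sketch below illustrates that idea for tab-separated tables with the `quant.sf` header `Name`, `Length`, `EffectiveLength`, `TPM`, `NumReads`. It is not Salmon's implementation. The function name `merge_quant_column` and the `NA` placeholder for missing entries are assumptions.

```python
import csv
import io


def merge_quant_column(sample_tables, column="TPM"):
    """Merge one column of several quant.sf-style tables.

    `sample_tables` maps a sample name to the text of its table.
    Returns (header, rows) with one row per feature name and one
    column per sample. Hypothetical helper; missing values become "NA".
    """
    samples = list(sample_tables)
    merged = {}
    for sample in samples:
        reader = csv.DictReader(io.StringIO(sample_tables[sample]), delimiter="\t")
        for row in reader:
            merged.setdefault(row["Name"], {})[sample] = row[column]
    header = ["Name"] + samples
    rows = [
        [name] + [values.get(sample, "NA") for sample in samples]
        for name, values in merged.items()
    ]
    return header, rows
```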

shell:
    "(merge_kallisto.R \
    --input {params.tables} \
    --names {params.sample_name_list} \
    --txOut FALSE \
    --anno {input.gtf} \
    --output {params.dir_out} \
    {params.additional_params} ) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1116 of workflow/Snakefile

shell:
    "(merge_kallisto.R \
    --input {params.tables} \
    --names {params.sample_name_list} \
    --output {params.dir_out} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1197 of workflow/Snakefile

shell:
    "(zpca-tpm  \
    --tpm {input.tpm} \
    --out {output.out} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1242 of workflow/Snakefile

shell:
    "(zpca-tpm  \
    --tpm {input.tpm} \
    --out {output.out} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1284 of workflow/Snakefile

shell:
    "(mkdir -p {params.out_dir}; \
    chmod -R 777 {params.out_dir}; \
    STAR \
    --runMode inputAlignmentsFromBAM \
    --runThreadN {threads} \
    --inputBAMfile {input.bam} \
    --outWigType bedGraph \
    --outFileNamePrefix {params.prefix} \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake STAR oschmod From line 1395 of workflow/Snakefile

shell:
    "(cp {input.plus} {output.plus}; \
    cp {input.minus} {output.minus};) \
    1>{log.stdout} 2>{log.stderr}"

SnakeMake From line 1485 of workflow/Snakefile

shell:
    "(mkdir -p {output.temp_dir}; \
    alfa -a {input.gtf} \
    -g {params.genome_index} \
    --chr_len {input.chr_len} \
    --temp_dir {output.temp_dir} \
    -p {threads} \
    -o {params.out_dir} \
    {params.additional_params}) \
    &> {log}"

SnakeMake From line 1555 of workflow/Snakefile

shell:
    "(mkdir -p {output.temp_dir};\
    cd {params.out_dir}; \
    alfa \
    -g {params.genome_index} \
    --bedgraph {params.plus} {params.minus} {params.name} \
    -s {params.alfa_orientation} \
    --temp_dir {params.temp_dir} \
    {params.additional_params}) \
    &> {log}"

SnakeMake From line 1677 of workflow/Snakefile

shell:
    "(python {input.script} \
    --config {output.multiqc_config} \
    --intro-text '{params.multiqc_intro_text}' \
    --custom-logo '{params.logo_path}' \
    --url '{params.url}' \
    --author-name '{params.author_name}' \
    --author-email '{params.author_email}' \
    {params.additional_params}) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1726 of workflow/Snakefile

shell:
    "(multiqc \
    --outdir {output.multiqc_report} \
    --config {input.multiqc_config} \
    {params.additional_params} \
    {params.results_dir} \
    {params.log_dir};) \
    1> {log.stdout} 2> {log.stderr}"

SnakeMake From line 1861 of workflow/Snakefile

shell:
    "(sortBed \
    -i {input.bg} \
    {params.additional_params} \
    > {output.sorted_bg};) 2> {log.stderr}"

SnakeMake From line 1916 of workflow/Snakefile
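
By default, `sortBed` orders records by chromosome and then by start position in ascending order. A minimal Python sketch of that default ordering for bedGraph lines follows. The function name `sort_bedgraph_lines` is hypothetical, and the sketch assumes the input contains only data lines (no `track` or `browser` lines).

```python
def sort_bedgraph_lines(lines):
    """Sort bedGraph records by chromosome, then by start coordinate.

    Hypothetical helper for illustration. Blank lines are dropped;
    chromosome names are compared by code point.
    """
    records = [line for line in lines if line.strip()]

    def key(line):
        chrom, start = line.split()[:2]
        return (chrom, int(start))

    return sorted(records, key=key)
```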

shell:
    "(bedGraphToBigWig \
    {params.additional_params} \
    {input.sorted_bg} \
    {input.chr_sizes} \
    {output.bigWig};) \
    1> {log.stdout} 2> {log.stderr}"