Paired-end RNAseq data processing workflow designed for execution on the BigPurple HPC. This version enables assembly of novel transcript isoforms. Primary software used: FastQC, Fastp, HISAT2, StringTie, MultiQC


RNAseq_PE_HISAT2_stringtie_novel_transcripts

This readme describes how to execute the Snakemake workflow for paired-end RNA-seq pre-processing (fastq -> feature counting) utilizing HISAT2 and StringTie for alignment and gene/transcript lev

Code Snippets

shell: 
	'fastqc {input.fastq} -o {params}'
shell:
	'fastp -w {threads} {params} -i {input.R1} -I {input.R2} -o {output.R1} -O {output.R2} --html {output.html} --json {output.json} 2> {log}'
	shell:
		'hisat2 {params} -p {threads} -x %s -1 {input.R1} -2 {input.R2} 2> {log} | samtools sort - -o alignment/{wildcards.sample}.bam -@ {threads}' % (genome)

rule count:
	input:
		bam = 'alignment/{sample}.bam'
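
The alignment command above mixes two substitution mechanisms: the Python `%` operator fills `%s` with the `genome` variable once, when the Snakefile is evaluated, and Snakemake fills the `{...}` placeholders later for each job. The sketch below imitates the second step with plain `str.format`, which only approximates Snakemake's own formatter. The index path, file names, thread count, and the `--dta` parameter are hypothetical values chosen for illustration.

```python
from types import SimpleNamespace

# Hypothetical value; in the workflow `genome` is defined elsewhere in the Snakefile.
genome = "/path/to/hisat2_index/genome"

# Step 1: '%' substitution happens once, when Python evaluates the string.
# The braces are left untouched because '%' formatting ignores them.
template = (
    "hisat2 {params} -p {threads} -x %s -1 {input.R1} -2 {input.R2} 2> {log} "
    "| samtools sort - -o alignment/{wildcards.sample}.bam -@ {threads}" % (genome)
)

# Step 2: stand-in for Snakemake's per-job substitution (hypothetical job values).
job = dict(
    params="--dta",
    threads=8,
    input=SimpleNamespace(R1="trimmed/A_R1.fastq.gz", R2="trimmed/A_R2.fastq.gz"),
    log="logs/hisat2/A.log",
    wildcards=SimpleNamespace(sample="A"),
)
command = template.format(**job)
print(command)
```

Because `%` is applied before Snakemake sees the string, a literal `%` elsewhere in the command would need to be written as `%%`.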
	shell:
		'stringtie -p {threads} {params} -G %s -o {output.trans_counts} -l {wildcards.sample} -A {output.gene_counts} {input.bam}' % (GTF)

rule combine_gtf:
	input:
		trans_counts = expand('stringtie/{sample}.gtf' , sample = sample_ids)
	output:
		gtf_list = 'stringtie/merged.txt'
	shell:
		'ls stringtie/*.gtf > {output.gtf_list}'
SnakeMake From line 92 of main/Snakefile
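
The `ls stringtie/*.gtf` glob lists every GTF in that directory, which would also include `stringtie/merged.gtf` once the merge step has run. As a possible alternative (not the author's method), the list could be written from the rule's declared inputs. The helper name `write_gtf_list` and the sample IDs below are hypothetical.

```python
from pathlib import Path
import tempfile


def write_gtf_list(gtf_paths, list_path):
    """Write one GTF path per line, in the order given."""
    Path(list_path).write_text("".join(f"{p}\n" for p in gtf_paths))


# Hypothetical sample IDs; the workflow builds its own `sample_ids` list.
sample_ids = ["A", "B"]
gtfs = [f"stringtie/{s}.gtf" for s in sample_ids]

# Write to a temporary directory so the sketch has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "merged.txt"
    write_gtf_list(gtfs, out)
    listed = out.read_text().splitlines()

print(listed)
```

Inside a Snakemake rule, a helper like this could be called from a `run:` block with `input.trans_counts` and `output.gtf_list`, so the list contains only the files the rule depends on.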
	shell:
		'stringtie --merge -p {threads} -G %s -o {output.merged_gtf} {input.gtf_list}' % (GTF)

rule count_2:
	input:
		bam = 'alignment/{sample}.bam',
		merged_gtf = 'stringtie/merged.gtf'
shell:
	'stringtie -p {threads} {params} -G {input.merged_gtf} -o {output.trans_counts} {input.bam}'
shell:
	'prepDE.py -l 65 -i {input.str_dir} -g {output.gene_counts} -t {output.transcript_counts}'
SnakeMake From line 123 of main/Snakefile
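
The `-l 65` option gives `prepDE.py` the average read length to assume when it converts StringTie's per-transcript coverage into read counts. The StringTie manual describes that conversion as coverage × transcript length / read length. The function below is a minimal sketch of the arithmetic only; the function name is hypothetical, and rounding up to a whole read is an assumption about the script's behavior.

```python
import math


def reads_per_transcript(coverage, transcript_len, read_len=65):
    """Approximate read count from average per-base coverage.

    Rounding up is an assumption, not confirmed against prepDE.py.
    """
    return int(math.ceil(coverage * transcript_len / read_len))


# A 1000 bp transcript covered 13x on average, with 65 bp reads.
print(reads_per_transcript(13.0, 1000))
# The same transcript if the read length were assumed to be 75 bp.
print(reads_per_transcript(13.0, 1000, read_len=75))
```

The comparison shows why `-l` should match the actual (post-trimming) read length: an overstated read length lowers every derived count proportionally.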

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/mgildea87/RNAseq_PE_HISAT2_stringtie_novel_transcripts
Name: rnaseq_pe_hisat2_stringtie_novel_transcripts
Version: 1
Copyright: Public Domain
License: None