FASTQ to BAM mapping pipeline with marked duplicates, indexing, and insert size metric collection


This is a Snakemake-based pipeline for mapping FASTQ reads with BWA-MEM. It marks duplicates, indexes the resulting BAM files, and collects insert size metrics, all of which are useful for downstream analysis. The Snakefile is derived from johanneskoester's example: https://bitbucket.org/johanneskoester/snakemake-workflows/src/e03c8a9fb7f256ed498b0a5984e9ae738a8f22e1/bio/ngs/rules/mapping/bwa_mem.rules?at=master

Code Snippets

# Summarize alignment flag counts for the BAM file.
shell:
    "samtools flagstat {input[0]} > {output}"
# Report per-reference sequence length and mapped/unmapped read counts.
shell:
    "samtools idxstats {input[0]} > {output}"
# Collect insert size metrics with Picard; the histogram PDF is written alongside the metrics file.
shell:
    """java -Xmx8G -jar $PICARD_DIR/CollectInsertSizeMetrics.jar I={input} O={output} H={output}.hist.pdf"""
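For downstream analysis, the metrics file written by the rule above can be read back in Python. The following is a minimal sketch, assuming the usual Picard metrics layout (a `## METRICS CLASS` line followed by a tab-separated header row and value rows). The function name `read_insert_size_metrics` and the abbreviated sample text are hypothetical and not part of the workflow.

```python
def read_insert_size_metrics(text):
    """Return the first metrics row of a Picard metrics file as a dict.

    Assumes the layout: a '## METRICS CLASS' line, then a tab-separated
    header line, then one or more tab-separated value lines.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            header = lines[i + 1].split("\t")
            values = lines[i + 2].split("\t")
            return dict(zip(header, values))
    raise ValueError("no '## METRICS CLASS' section found")


# Hypothetical, abbreviated input; real files carry more columns.
SAMPLE = (
    "## METRICS CLASS\tpicard.analysis.InsertSizeMetrics\n"
    "MEDIAN_INSERT_SIZE\tMEAN_INSERT_SIZE\tSTANDARD_DEVIATION\n"
    "350\t352.1\t48.7\n"
)

metrics = read_insert_size_metrics(SAMPLE)
print(metrics["MEDIAN_INSERT_SIZE"])
```

Values are returned as strings, so convert them with `int` or `float` before doing arithmetic.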
# Index the BAM file.
shell: "samtools index {input}"
run:
    # Merge when there are several input BAMs; otherwise copy the single BAM.
    if len(input) > 1:
        shell("samtools merge -p -@ 8 {output} {input}")
    else:
        shell("rsync --bwlimit=50000 {input} {output}")
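The branch in this `run` block can be checked outside Snakemake. Below is a minimal sketch that only builds the command string and does not execute anything; the helper name `merge_or_copy_command` and the file names are hypothetical, and the thread count and bandwidth limit are copied from the snippet above.

```python
def merge_or_copy_command(inputs, output, threads=8, bwlimit=50000):
    """Build the shell command the run block would execute.

    Several inputs are merged with samtools; a single input is copied.
    """
    if len(inputs) > 1:
        return "samtools merge -p -@ {} {} {}".format(
            threads, output, " ".join(inputs)
        )
    return "rsync --bwlimit={} {} {}".format(bwlimit, inputs[0], output)


# Hypothetical file names, for illustration only.
print(merge_or_copy_command(["lane1.bam", "lane2.bam"], "sample.bam"))
print(merge_or_copy_command(["lane1.bam"], "sample.bam"))
```

Copying in the single-input case avoids running `samtools merge` on one file.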
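# Map with run-bwamem, which prints the commands to run; these are piped to bash.
# The BAM is written under $TMPDIR, then copied to the output path and indexed.
shell:
    """set -eo pipefail
       run-bwamem -t {params.bwa_threads} -dso $TMPDIR/tmp {input[0]} {input[1]} | bash
       rsync --bwlimit=50000 $TMPDIR/tmp.aln.bam {output}
       samtools index {output}"""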
# Map with bwa mem, mark duplicates with samblaster, then sort and index the BAM.
shell:
    """set -eo pipefail
    bwa mem {params.custom} -R '@RG\\tID:{params.flowcell}_{wildcards.lane}\\tSM:{params.sample}\\tLB:{params.sample}\\tPL:{config[platform]}\\tPU:{params.flowcell}' \
        -t {params.bwa_threads} {input} 2> {log} | \
    samblaster | \
    samtools sort -@ {params.samtools_threads} -m {params.samtools_memory} -O bam -T $TMPDIR/{wildcards.lane} -o {output}
    samtools index {output}"""
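The `-R` argument above assembles a read group header from the flowcell, lane, sample, and platform. A minimal sketch of the same string in plain Python follows; the function name `read_group` and the example values are hypothetical. The fields are joined by a literal backslash followed by `t`, not by real tab characters, because bwa mem converts that sequence to tabs itself.

```python
def read_group(flowcell, lane, sample, platform):
    """Build the @RG string passed to `bwa mem -R` in the rule above."""
    fields = [
        "@RG",
        "ID:{}_{}".format(flowcell, lane),  # one read group per flowcell and lane
        "SM:" + sample,
        "LB:" + sample,  # the rule reuses the sample name as the library
        "PL:" + platform,
        "PU:" + flowcell,
    ]
    # Join with a literal backslash-t; bwa mem turns it into a tab.
    return "\\t".join(fields)


# Hypothetical values, for illustration only.
print(read_group("FC1", "L001", "NA12878", "ILLUMINA"))
```

Because `ID` combines flowcell and lane, reads from different lanes keep distinct read groups after merging.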

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/bnelsj/bwa_mem_mapping
Name: bwa_mem_mapping
Version: 1

Copyright: Public Domain
License: None
