(S)train (A)ssignment from (M)etagen(O)me (S)NP (A)nalysis.

Version: 1.0

SAMOSA is a Snakemake workflow for strain-level detection in metagenomic samples. The pipeline is under active development and aims to detect mixed colonization, for any species, from metagenome samples.

Workflow

Usage

snakemake -j 999 --cluster-config config/cluster.json --cluster "sbatch -A {cluster.account} -p {cluster.partition} -N {cluster.nodes} -t {cluster.walltime} -c {cluster.procs} --mem-per-cpu {cluster.pmem}" --use-conda
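The `--cluster` template above reads six values from `config/cluster.json`: `account`, `partition`, `nodes`, `walltime`, `procs`, and `pmem`. A minimal sketch of such a file follows, assuming the standard Snakemake cluster-config layout in which a `__default__` entry supplies the values for every rule. All values are placeholders, and the file name `cluster.example.json` is hypothetical, chosen so that the sketch does not overwrite the configuration shipped with the workflow.

```shell
# Write an example cluster configuration with placeholder values.
# Replace the account, partition, and limits with your own Slurm settings.
cat > cluster.example.json <<'EOF'
{
    "__default__": {
        "account": "my_account",
        "partition": "standard",
        "nodes": 1,
        "walltime": "01:00:00",
        "procs": 1,
        "pmem": "4G"
    }
}
EOF

# Show the file that was written.
cat cluster.example.json
```

Adding `-n` to the `snakemake` command performs a dry run, which lists the jobs that would be submitted without running them.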

Code Snippets

shell: 
    "bowtie2-build {input} {params.basename}"
shell:
    "sample=$(echo {input.r1} | cut -d'/' -f3 | sed 's/_R1_001.fastq.gz//g') && bowtie2 -x {params.index} -1 {input.r1} -2 {input.r2} -U {input.r1_unpaired} -U {input.r2_unpaired} -S {output.samout} -t -p {params.threads} --non-deterministic --end-to-end --rg-id $sample --rg SM:$sample --rg LB:1 --rg PL:Illumina && samtools view -Sb {output.samout} > {output.bamout} && samtools sort -O BAM -o {output.bamsortout} {output.bamout} &>{log}"
shell:
    "freebayes-parallel <(fasta_generate_regions.py {input.samtoolsreferenceindex} 100000) {params.threads} -f {input.reference} --min-alternate-count 2 --min-alternate-fraction 0 --ploidy 1 --pooled-continuous --report-monomorphic {input.bamdedup} > {output.rawvcf}"
shell:
    "picard MarkDuplicates REMOVE_DUPLICATES=true I={input.bamsortout} O={output.bamdedup} M={output.bamdedupmetrics} 2>{log}"
shell:
    "picard CreateSequenceDictionary R={input.reference} O={output.ref_seqdict}"
shell:
    "java -Xmx5G -jar ./bin/GenomeAnalysisTK.jar -T DepthOfCoverage -R {input.reference} -o {output.gatk_depth_summary} -I {input.bamdedup} --summaryCoverageThreshold 1 --summaryCoverageThreshold 2 --summaryCoverageThreshold 3 --summaryCoverageThreshold 4 --summaryCoverageThreshold 5 --summaryCoverageThreshold 6 --summaryCoverageThreshold 7 --summaryCoverageThreshold 8 --summaryCoverageThreshold 9 --summaryCoverageThreshold 10 --summaryCoverageThreshold 15 --summaryCoverageThreshold 20 --summaryCoverageThreshold 25 --ignoreDeletionSites 2>{log}"
shell:
    "java -Xmx5G -jar /nfs/esnitkin/bin_group/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar HaplotypeCaller -R {input.reference} -O {output.gatk_vcf} -I {input.bamdedup} -ploidy 1 --annotate-with-num-discovered-alleles true --annotation-group AlleleSpecificAnnotation --annotation AlleleFraction -ERC BP_RESOLUTION --bam-output {output.gatk_haplotypecaller_bam} 2>{log}"
SnakeMake From line 54 of rules/gatk.smk
shell:
    "java -Xmx5G -jar /nfs/esnitkin/bin_group/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar Mutect2 -R {input.reference} -O {output.gatk_mutect_vcf} -I {input.bamdedup} -af-of-alleles-not-in-resource 0.33 --annotation-group AlleleSpecificAnnotation --annotation AlleleFraction -ERC BP_RESOLUTION --bam-output {output.gatk_mutect_bam} 2>{log}"
SnakeMake From line 70 of rules/gatk.smk
shell:
    "java -Xmx5G -jar /nfs/esnitkin/bin_group/gatk-4.2.0.0/gatk-package-4.2.0.0-local.jar MergeVcfs --INPUT {input.rawvcf} --INPUT {input.gatk_vcf} --INPUT {input.gatk_mutect_vcf} --OUTPUT {output.mergedvcf} -R /scratch/esnitkin_root/esnitkin/apirani/Project_VRE_metagenomics_analysis/2021_04_09_VREfm_variant_calling/Reference_genome/Aus0004//Aus0004.fasta --CREATE_INDEX 2>{log}"
SnakeMake From line 85 of rules/gatk.smk
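The DepthOfCoverage command above passes `--summaryCoverageThreshold` once for each of thirteen cutoffs. As a sketch only, and not how the workflow itself is written, the same argument list can be generated with a loop, which makes the set of cutoffs easier to read and change. The variable name `flags` is an illustrative choice.

```shell
# Build the repeated --summaryCoverageThreshold arguments from one list.
flags=""
for t in 1 2 3 4 5 6 7 8 9 10 15 20 25; do
    flags="$flags --summaryCoverageThreshold $t"
done

# Print the assembled argument string.
echo "$flags"
```

If a loop like this were placed inside a Snakemake `shell:` string, any literal curly braces would have to be doubled, because Snakemake formats the string before running it. The sketch avoids braces for that reason.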
shell:
    "samtools index {input.bamdedup}"
shell:
    "samtools faidx {input}"
shell:
    "trimmomatic PE {input.r1} {input.r2} {output.r1} {output.r1_unpaired} {output.r2} {output.r2_unpaired} -threads {params.threads} ILLUMINACLIP:{params.adapter_filepath}:{params.seed}:{params.palindrome_clip}:{params.simple_clip}:{params.minadapterlength}:{params.keep_both_reads} SLIDINGWINDOW:{params.window_size}:{params.window_size_quality} MINLEN:{params.minlength} HEADCROP:{params.headcrop_length} &>{log}"
shell: 'rm -r results/*'
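The bowtie2 alignment rule in the snippets above derives the read-group ID and sample name from the R1 file path by taking the third `/`-separated field and stripping the `_R1_001.fastq.gz` suffix. The sketch below runs that same `cut` and `sed` pipeline on two hypothetical paths to show that the result depends on directory depth, and compares it with `basename`, which does not. The paths are assumptions for illustration; the workflow's real directory layout is not shown in the source.

```shell
# Hypothetical paths for illustration; the workflow's real layout may differ.
r1="results/trimmed/Sample01_R1_001.fastq.gz"
deep="data/run1/trimmed/Sample01_R1_001.fastq.gz"

# Same pipeline as the rule: third '/'-separated field, then strip the suffix.
from_cut=$(echo "$r1" | cut -d'/' -f3 | sed 's/_R1_001.fastq.gz//g')
from_cut_deep=$(echo "$deep" | cut -d'/' -f3 | sed 's/_R1_001.fastq.gz//g')

# Depth-independent alternative: basename drops the directories and the suffix.
from_basename=$(basename "$deep" _R1_001.fastq.gz)

echo "$from_cut"        # Sample01
echo "$from_cut_deep"   # trimmed
echo "$from_basename"   # Sample01
```

With the deeper path, the third field is a directory name, so the read group would be labelled `trimmed` rather than with the sample name. The `cut` approach is therefore only safe when the FASTQ files sit exactly two directories below the working directory.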

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://alipirani88.github.io/samosa
Name: samosa
Version: 1.0
Copyright: Public Domain
License: None