DiVA (DNA Variant Analysis) is a pipeline for Next-Generation Sequencing exome data analysis

Version: v2.0

This workflow performs mapping and variant calling following the GATK Best Practices for Germline Variant Discovery. DiVA is part of solida-core, a collection of Snakemake-based pipelines developed and maintained at CRS4.

www.crs4.it

Authors

  • Matteo Massidda (@massiddaMT)

  • Rossano Atzeni (@ratzeni)

Usage

The usage of this workflow is described in the Snakemake Workflow Catalog.

If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository and its DOI (see above).

INSTRUCTIONS

Create a conda environment with the command:

mamba create -c bioconda -c conda-forge --name snakemake snakemake=6.15 snakedeploy

and activate it:

conda activate snakemake

Public test data are available for trying out the pipeline. You can clone them from GitHub directly into the current folder:

git clone https://github.com/solida-core/test-data-DNA.git

You can then deploy the pipeline, specifying a directory my_dest_dir for the analysis output and a pipeline tag for a specific version:

snakedeploy deploy-workflow https://github.com/solida-core/diva my_dest_dir --tag XXXX

To run the pipeline, change into the deployed pipeline folder and use the command:

snakemake --use-conda -p --cores all

You can generate an analysis report with the command:

snakemake --report report.zip --cores all

Code Snippets

HaplotypeCaller (GVCF mode):
shell:
    "gatk HaplotypeCaller --java-options {params.custom} "
    "-R {params.genome} "
    "-I {input.bam} "
    "-O {output.gvcf} "
    "-ERC GVCF "
    "-L {params.intervals} "
    "-ip 200 "
    "-G StandardAnnotation "
    "--max-reads-per-alignment-start 0 "
    "--min-base-quality-score 20 "
    "--add-output-vcf-command-line false "
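The shell directive above is a sequence of adjacent string literals with placeholders in braces. As a rough illustration of how such a template turns into a single command line, the sketch below uses plain Python `str.format` with stand-in objects. Snakemake applies its own formatting at run time, so this only approximates the behaviour, and every value (`-Xmx4g`, `ref.fa`, `targets.bed`, the sample file names) is hypothetical.

```python
from types import SimpleNamespace

# Adjacent string literals are concatenated by Python, so the trailing
# space inside each fragment is what separates the command-line arguments.
template = (
    "gatk HaplotypeCaller --java-options {params.custom} "
    "-R {params.genome} "
    "-I {input.bam} "
    "-O {output.gvcf} "
    "-ERC GVCF "
    "-L {params.intervals} "
    "-ip 200"
)

# Hypothetical values, for illustration only (not taken from the pipeline config).
params = SimpleNamespace(custom="-Xmx4g", genome="ref.fa", intervals="targets.bed")
input_ = SimpleNamespace(bam="sample1.bam")
output = SimpleNamespace(gvcf="sample1.g.vcf.gz")

# "{params.custom}" is resolved by attribute lookup on the keyword argument.
command = template.format(params=params, input=input_, output=output)
print(command)
```

Dropping the trailing space from any fragment would fuse two arguments together, which is why every line of the snippet ends with one.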
Relatedness check (vcftools):
shell:
    "vcftools "
    "--gzvcf {input} "
    "--out {params.out_basename} "
    "--relatedness2 "
    ">& {log}"
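With `--relatedness2`, vcftools writes a tab-separated table (`<out_basename>.relatedness2`) whose last column, `RELATEDNESS_PHI`, is a kinship estimate for each pair of samples. The following is a minimal sketch of how that table could be screened for related pairs. It is not part of the pipeline: the function names are hypothetical, the example rows are made up, and the cut-offs are the commonly quoted KING thresholds, which you may want to adjust for exome data.

```python
import csv
import io


def degree(phi):
    """Map a kinship coefficient to a coarse relationship class (KING-style cut-offs)."""
    if phi > 0.354:
        return "duplicate/MZ twin"
    if phi > 0.177:
        return "1st degree"
    if phi > 0.0884:
        return "2nd degree"
    if phi > 0.0442:
        return "3rd degree"
    return "unrelated"


def related_pairs(handle):
    """Yield (sample_a, sample_b, phi, class) once per distinct related pair."""
    seen = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        a, b = row["INDV1"], row["INDV2"]
        key = frozenset((a, b))
        if a == b or key in seen:  # skip self-comparisons and mirrored rows
            continue
        seen.add(key)
        phi = float(row["RELATEDNESS_PHI"])
        label = degree(phi)
        if label != "unrelated":
            yield a, b, phi, label


# Made-up example rows in the .relatedness2 column layout.
EXAMPLE = (
    "INDV1\tINDV2\tN_AaAa\tN_AAaa\tN1_Aa\tN2_Aa\tRELATEDNESS_PHI\n"
    "S1\tS1\t100\t0\t100\t100\t0.5\n"
    "S1\tS2\t60\t2\t100\t110\t0.26\n"
    "S2\tS1\t60\t2\t110\t100\t0.26\n"
    "S1\tS3\t10\t30\t100\t90\t-0.05\n"
)

for pair in related_pairs(io.StringIO(EXAMPLE)):
    print(pair)
```

To screen real output, pass an open file handle for the `.relatedness2` file instead of the in-memory example.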
GenomicsDBImport:
shell:
    "mkdir -p {params.base_db} ; "
    "gatk GenomicsDBImport --java-options {params.custom} "
    "{params.gvcfs} "
    "--genomicsdb-workspace-path {params.db} "
    "-L {params.intervals} "
    "-ip 200 "
    "--merge-input-intervals "
    ">& {log} "
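GenomicsDBImport takes one `-V` argument per input GVCF, so `{params.gvcfs}` presumably expands to a string of repeated `-V` flags. How the pipeline builds that string is not shown here. The sketch below is one plausible way to do it, with a hypothetical helper name and hypothetical file paths.

```python
import shlex


def gvcf_args(paths):
    """Return a '-V a.g.vcf.gz -V b.g.vcf.gz ...' string for GenomicsDBImport.

    Each path is shell-quoted so that names containing spaces survive
    interpolation into a shell command.
    """
    return " ".join(f"-V {shlex.quote(str(p))}" for p in paths)


# Hypothetical per-sample GVCFs produced by the HaplotypeCaller step.
print(gvcf_args(["variant_calling/S1.g.vcf.gz", "variant_calling/S2.g.vcf.gz"]))
```

In a Snakemake rule, a helper like this would typically be called from a `params` function that receives the rule's input files.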
GenotypeGVCFs:
shell:
    "gatk GenotypeGVCFs --java-options {params.custom} "
    "-R {params.genome} "
    "-V gendb://{params.db} "
    "-G StandardAnnotation "
VariantRecalibrator:
shell:
    "gatk VariantRecalibrator --java-options {params.custom} "
    "-R {params.genome} "
    "-V {input.vcf} "
    "{params.recal} "
    "-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 "
    "--output {output.recal} "
    "--tranches-file {output.tranches} "
    "--rscript-file {output.plotting} "
    ">& {log}"
ApplyVQSR:
shell:
    "gatk ApplyVQSR --java-options {params.custom} "
    "-R {params.genome} "
    "-V {input.vcf} -mode {params.mode} "
    "--recal-file {input.recal} -ts-filter-level 99.0 "
    "--tranches-file {input.tranches} -O {output} "
    ">& {log}"

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/solida-core/diva
Name: diva
Version: v2.0
Copyright: Public Domain
License: None