SVONT-Pipeline: Structural Variant Detection and Annotation from Oxford Nanopore Data using Snakemake

SVONT-pipeline is a pipeline for structural variant detection and annotation from Oxford Nanopore data. Oxford Nanopore Technologies provides technology for long-read sequencing. The pipeline is implemented in Snakemake, a Python-based workflow manager, and is defined in a Snakefile.

Files in the following formats are used as input:

  • sequences in FAST5 format

  • sequences in FASTQ format compressed as .gz files

  • reference sequence in FASTA format

The output of SVONT-pipeline is a TSV file that summarizes the annotated structural variants.

Features

SVONT-pipeline performs the following steps:

  • decompression of the gzipped FASTQ files

  • concatenation of all decompressed FASTQ files into a single FASTQ file

  • conversion of the FASTQ file to FASTA format

  • computation of read statistics and their visualization

  • mapping of reads to the reference sequence

  • detection of structural variants

  • annotation of detected variants
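The order of these steps follows from what each step consumes. As a rough sketch, the dependencies can be written as a graph and sorted topologically, which is similar in spirit to how Snakemake orders its rules. The step names and edges below are illustrative assumptions inferred from the list above; they are not the rule names used in the Snakefile.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each step maps to the steps it depends on.
# The edges are inferred from the step list, not read from the Snakefile.
STEP_DEPENDENCIES = {
    "unzip_fastq": set(),
    "merge_fastq": {"unzip_fastq"},
    "fastq_to_fasta": {"merge_fastq"},
    "read_statistics": {"merge_fastq"},
    "map_reads": {"fastq_to_fasta"},
    "detect_variants": {"map_reads"},
    "annotate_variants": {"detect_variants"},
}


def execution_order(dependencies):
    """Return one valid order in which the steps can be executed."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for position, step in enumerate(execution_order(STEP_DEPENDENCIES), start=1):
        print(position, step)
```

Several orders are valid, because read statistics do not depend on mapping or variant calling.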

Dependencies

The following packages are required to run SVONT-pipeline:
minimap2, nanopolish, pysam, samtools, snakemake, sniffles2, NanoPlot, NanoStat, AnnotSV, gzip, and the script fastq_to_fasta.py

The packages nanopolish, minimap2, samtools, pysam, sniffles=2.0, snakemake, gzip, NanoPlot and NanoStat can be installed using Conda:

conda install -c bioconda nanopolish minimap2 samtools pysam sniffles=2.0 snakemake NanoPlot NanoStat
conda install -c conda-forge gzip

AnnotSV is not installed through Conda here; it needs to be cloned from its GitHub repository and installed from there:

git clone https://github.com/lgmgeo/AnnotSV.git
cd AnnotSV
make install

The Python script fastq_to_fasta.py can be downloaded from this repository.
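Before the first run it can help to confirm that the command-line tools are reachable. The following is a minimal sketch, assuming the executables carry the names used in the pipeline's shell commands; the function name is illustrative. It does not cover pysam (a Python module) or AnnotSV (referenced by path in the configuration file).

```python
import shutil

# Executable names as they appear in the pipeline's shell commands.
REQUIRED_TOOLS = [
    "minimap2",
    "nanopolish",
    "samtools",
    "snakemake",
    "sniffles",
    "NanoPlot",
    "NanoStat",
    "gzip",
]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All command-line tools were found.")
```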

How to run SVONT-pipeline

Configuration file

SVONT-pipeline uses a configuration file in YAML format, which has to contain the following variables:

run: name_of_the_run
fast5Dir: path_to_fast5_directory
ref: path_to_reference_fasta_file  # the index file should also be present in the same folder
fastqDir: path_to_fastq_directory
AnnotSV: path_to_AnnotSV_directory_which_was_installed
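
A missing variable usually surfaces only when a rule first needs it, so checking the configuration up front can save time. The sketch below is one way to do that, assuming a flat file of "key: value" lines as shown above. The small parser is a stand-in for a real YAML library and only handles this simple layout; the function names and example paths are hypothetical.

```python
# The five variables listed above.
REQUIRED_KEYS = ("run", "fast5Dir", "ref", "fastqDir", "AnnotSV")


def parse_simple_config(text):
    """Parse flat 'key: value' lines; not a full YAML parser."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"not a 'key: value' line: {raw_line!r}")
        config[key.strip()] = value.strip()
    return config


def missing_keys(config):
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    # Hypothetical example values.
    example = """
    run: demo_run
    fast5Dir: ../data/example_input/fast5
    ref: ../data/ref/reference.fasta
    fastqDir: ../data/example_input/fastq
    """
    print(missing_keys(parse_simple_config(example)))
```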

Folder structure

To run SVONT-pipeline, the user needs to have the following folder structure:

|
├── data/
|   ├── example_input/
|   └── ref/
└── src/
    ├── Snakefile
    ├── scripts/
    |   └── fastq_to_fasta.py
    └── config/
        └── example_config.yaml
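
The layout can be checked programmatically before launching the pipeline. This is a minimal sketch, assuming the directory and file names shown above; the helper names are illustrative, and create_skeleton only writes empty placeholder files for demonstration.

```python
import tempfile
from pathlib import Path

EXPECTED_DIRS = ["data/example_input", "data/ref", "src/scripts", "src/config"]
EXPECTED_FILES = [
    "src/Snakefile",
    "src/scripts/fastq_to_fasta.py",
    "src/config/example_config.yaml",
]


def missing_entries(root):
    """Return expected directories and files that do not exist under root."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


def create_skeleton(root):
    """Create the directories and empty placeholder files (demo only)."""
    root = Path(root)
    for directory in EXPECTED_DIRS:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file_name in EXPECTED_FILES:
        (root / file_name).touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("before:", missing_entries(tmp))
        create_skeleton(tmp)
        print("after:", missing_entries(tmp))
```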

Pipeline execution

Run the following command in the src folder:

snakemake --configfile config/example_config.yaml -c1

A successful run will create a run directory in the data folder.
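
For scripted runs, the same command can be assembled in Python. The sketch below only builds the argument list; the function name and its parameters are illustrative. It uses --cores, the long form of -c, and -n, Snakemake's dry-run flag.

```python
import shlex


def snakemake_command(configfile="config/example_config.yaml", cores=1, dry_run=False):
    """Build the snakemake argument list used to start the pipeline."""
    command = ["snakemake", "--configfile", configfile, "--cores", str(cores)]
    if dry_run:
        command.append("-n")  # list the planned jobs without running them
    return command


if __name__ == "__main__":
    # Print the command; to execute it, pass the list to
    # subprocess.run(command, cwd="src", check=True).
    print(shlex.join(snakemake_command(dry_run=True)))
```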

Output

The output comprises the following files and directories in the data folder:

  • AnnotSV.log

  • reads.fasta

  • reads.fasta.index

  • reads.fasta.index.fai

  • reads.fasta.index.gzi

  • reads.fasta.index.readdb

  • reads.vcf

  • reads_fastq_all.fastq

  • reads-ref.sorted.bam

  • reads-ref.sorted.bam.bai

  • stats

  • directory annotation

  • directory fastq

  • directory graphs

  • directory log
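
After a run, the list above can be used as a checklist. A minimal sketch, assuming all entries sit directly inside the run directory (the exact location of that directory is an assumption here, and the function name is illustrative):

```python
import tempfile
from pathlib import Path

# Names taken from the output list above.
EXPECTED_OUTPUTS = [
    "AnnotSV.log",
    "reads.fasta",
    "reads.fasta.index",
    "reads.fasta.index.fai",
    "reads.fasta.index.gzi",
    "reads.fasta.index.readdb",
    "reads.vcf",
    "reads_fastq_all.fastq",
    "reads-ref.sorted.bam",
    "reads-ref.sorted.bam.bai",
    "stats",
    "annotation",
    "fastq",
    "graphs",
    "log",
]


def missing_outputs(run_dir):
    """Return the expected outputs that are not present in run_dir."""
    run_dir = Path(run_dir)
    return [name for name in EXPECTED_OUTPUTS if not (run_dir / name).exists()]


if __name__ == "__main__":
    # Demo on an empty temporary directory with one placeholder file.
    with tempfile.TemporaryDirectory() as run_dir:
        (Path(run_dir) / "reads.vcf").touch()
        print(missing_outputs(run_dir))
```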

Code Snippets

# Print the paths of the rule's input files
shell:
    "echo {input}"
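# Annotate the VCF with AnnotSV (split mode, GRCh37); output goes to the log,
# and the trailing & runs AnnotSV in the background
shell:
    """
    export ANNOTSV={input.AnnotSV}
    {input.AnnotSV}/bin/AnnotSV -SVinputFile {input.vcf} \
    -annotationMode split -genomeBuild GRCh37 -outputDir {output} >& {log} &
    """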
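The annotation step produces a tab-separated table that is easy to summarize with the standard library. The sketch below counts structural variants per type. The column names AnnotSV_ID and SV_type are assumptions about the AnnotSV output header and should be checked against the actual file; the sample rows are made up. Because split mode can emit several rows per variant (one per overlapped gene), variants are counted once per ID.

```python
import csv
import io
from collections import Counter

# Made-up rows; the column names are assumed, not taken from a real output file.
SAMPLE_TSV = (
    "AnnotSV_ID\tSV_chrom\tSV_start\tSV_end\tSV_type\tAnnotation_mode\tGene_name\n"
    "sv1\t1\t1000\t5000\tDEL\tsplit\tGENE_A\n"
    "sv1\t1\t1000\t5000\tDEL\tsplit\tGENE_B\n"
    "sv2\t2\t700\t700\tINS\tsplit\tGENE_C\n"
)


def count_sv_types(tsv_text, id_column="AnnotSV_ID", type_column="SV_type"):
    """Count variants per type, counting each variant ID only once."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    type_by_id = {}
    for row in reader:
        type_by_id[row[id_column]] = row[type_column]
    return Counter(type_by_id.values())


if __name__ == "__main__":
    for sv_type, count in sorted(count_sv_types(SAMPLE_TSV).items()):
        print(sv_type, count)
```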
# Call structural variants from the sorted BAM with Sniffles and write them to a VCF
shell:
    "sniffles --input {input.sorted} --vcf {output}"
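A quick look at the resulting VCF can confirm that variant calling produced something sensible. The sketch below tallies the SVTYPE entry of the INFO column using plain string handling. The function name is illustrative and the sample records are made up, not pipeline output.

```python
from collections import Counter

# Made-up VCF records for demonstration.
SAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tsv1\tN\t<DEL>\t60\tPASS\tPRECISE;SVTYPE=DEL;SVLEN=-400;END=1400",
    "chr1\t9000\tsv2\tN\t<INS>\t60\tPASS\tIMPRECISE;SVTYPE=INS;SVLEN=250;END=9000",
    "chr2\t500\tsv3\tN\t<DEL>\t60\tPASS\tPRECISE;SVTYPE=DEL;SVLEN=-90;END=590",
]


def count_svtypes(vcf_lines):
    """Count records per SVTYPE found in the INFO column (8th field)."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        info = line.rstrip("\n").split("\t")[7]
        for entry in info.split(";"):
            key, _, value = entry.partition("=")
            if key == "SVTYPE":
                counts[value] += 1
                break
    return counts


if __name__ == "__main__":
    print(dict(count_svtypes(SAMPLE_VCF)))
```

For a real file, an open file handle can be passed instead of the list, for example `with open("reads.vcf") as handle: count_svtypes(handle)`.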
# Map reads to the reference with minimap2 (8 threads), sort the alignments,
# then index the sorted BAM
shell:
    "minimap2 -ax map-ont -t 8 {input.ref} {input.fasta} | samtools sort -o {output} -T reads.tmp; samtools index {output}"
# Index the reads against the FAST5 directory for nanopolish
shell:
    "nanopolish index -d {input.fast5Dir} {input.fasta} > {log}"
# Plot read statistics with NanoPlot, using the run name as the title
shell:
    "NanoPlot -o {output} --color green --format jpg --title {wildcards.run} --fasta {input.fasta}"
# Compute summary statistics of the FASTQ reads with NanoStat
shell:
    "NanoStat -n {output} --fastq {input}"
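To give a sense of the kind of numbers a read-statistics step reports, the sketch below computes read count, mean length and N50 from FASTQ text. It is a simplified illustration, not a replacement for NanoStat, and it assumes four lines per record without wrapped sequences. The function names are illustrative.

```python
def read_lengths(fastq_lines):
    """Return the sequence length of each 4-line FASTQ record."""
    return [
        len(line.strip())
        for index, line in enumerate(fastq_lines)
        if index % 4 == 1  # the sequence is the second line of each record
    ]


def n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_of_bases = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        if running_total >= half_of_bases:
            return length
    return 0  # no reads


if __name__ == "__main__":
    sample = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGTACGT", "+", "IIIIIIII"]
    lengths = read_lengths(sample)
    print("reads:", len(lengths))
    print("mean length:", sum(lengths) / len(lengths))
    print("N50:", n50(lengths))
```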
# Concatenate all FASTQ files in the directory into a single file
shell:
    "cat {input.fastqDir}/* > {output}"
# Convert FASTQ to FASTA with the bundled Python script
script:
    "scripts/fastq_to_fasta.py"
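The conversion script itself is not reproduced here. As a rough idea of what such a conversion involves, the following sketch turns four-line FASTQ records into FASTA records. It is not the repository's fastq_to_fasta.py; the function name is illustrative and wrapped sequences are not handled.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Write each 4-line FASTQ record as a 2-line FASTA record; return the count."""
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected a FASTQ header, got: {header!r}")
        sequence = fastq_handle.readline()
        fastq_handle.readline()  # '+' separator line, ignored
        fastq_handle.readline()  # quality line, ignored
        fasta_handle.write(">" + header[1:].rstrip("\n") + "\n")
        fasta_handle.write(sequence.rstrip("\n") + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Inside a Snakemake "script:" step, the paths would typically come from
    # snakemake.input[0] and snakemake.output[0]; here in-memory text is used.
    source = io.StringIO("@read1\nACGT\n+\nIIII\n")
    target = io.StringIO()
    fastq_to_fasta(source, target)
    print(target.getvalue(), end="")
```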
# Copy the FASTQ directory and decompress every file in the copy
shell:
    """
    cp -r {input.fastqDir} {output}
    gzip -d {output}/*
    """
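
The decompression and concatenation steps can also be expressed in Python. The sketch below streams every .gz file of a directory into one merged file, which avoids keeping a decompressed copy of each input. It is an alternative illustration, not what the Snakefile does (the pipeline uses cp, gzip -d and cat); the function name and file names are hypothetical.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def merge_gzipped_fastq(fastq_dir, merged_path):
    """Decompress every *.gz file in fastq_dir and append it to merged_path."""
    gz_files = sorted(Path(fastq_dir).glob("*.gz"))  # sorted for a stable order
    with open(merged_path, "wb") as merged:
        for gz_file in gz_files:
            with gzip.open(gz_file, "rb") as source:
                shutil.copyfileobj(source, merged)
    return len(gz_files)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Two tiny made-up inputs for demonstration.
        for name, text in [("a.fastq.gz", b"@r1\nAC\n+\nII\n"), ("b.fastq.gz", b"@r2\nGT\n+\nII\n")]:
            with gzip.open(Path(tmp) / name, "wb") as handle:
                handle.write(text)
        merged = Path(tmp) / "reads_fastq_all.fastq"
        print(merge_gzipped_fastq(tmp, merged), "files merged")
        print(merged.read_text(), end="")
```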


Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/PaMrhac/SVONT-pipeline
Name: svont-pipeline
Version: 1
Copyright: Public Domain
License: None