Assemblosis: CWL-based workflow to assemble haploid/diploid eukaryote genomes of non-model organisms

The workflow is designed to use both PacBio long reads and Illumina short reads. It first extracts, corrects, trims and decontaminates the long reads. The decontaminated, trimmed reads are then used to assemble the genome, and the raw reads are used to polish it. Next, the Illumina reads are cleaned and used to further polish the resulting assembly. Finally, the polished assembly is masked using inferred repeats, and haplotypes are eliminated. The workflow uses BioConda and DockerHub to install the required software and is therefore fully automated. In addition to the final assembly, the workflow produces intermediate assemblies before and after the polishing steps. The workflow follows the syntax for CWL v1.0.
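The step ordering described above can be sketched as a CWL v1.0 Workflow in which each step consumes the output of an earlier one. This is only a minimal illustration: the step names, the tool file names (correct.cwl, assemble.cwl, polish.cwl) and every input and output identifier are hypothetical and are not taken from the Assemblosis repository.

```yaml
cwlVersion: v1.0
class: Workflow

inputs:
  rawLongReads: File          # hypothetical input name

outputs:
  draftAssembly:              # intermediate assembly, before polishing
    type: File
    outputSource: assemble/assembly
  polishedAssembly:           # assembly after polishing
    type: File
    outputSource: polish/polished

steps:
  correct:
    run: correct.cwl          # hypothetical tool description
    in:
      reads: rawLongReads
    out: [corrected]

  assemble:
    run: assemble.cwl         # hypothetical tool description
    in:
      reads: correct/corrected
    out: [assembly]

  polish:
    run: polish.cwl           # hypothetical tool description
    in:
      assembly: assemble/assembly
      reads: rawLongReads     # raw reads are reused for polishing
    out: [polished]
```

Listing both the draft and the polished assembly under outputs is one way a workflow can expose intermediate results alongside the final one.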

Software tools used in this pipeline

Code Snippets

baseCommand: ["/home/smrtpipe.sh"]
arguments: []
CWL From line 43 of Run/arrow.cwl

baseCommand: [bowtie2]

baseCommand: ["canu", "-trim-assemble"]
arguments: []

baseCommand: ["canu", "-correct"]
arguments: []

baseCommand: ["centrifuge","-f"]
arguments:
- valueFrom: $(inputs.database.path)/nt
  prefix: -x
  position: 1
hints:
  - class: DockerRequirement
    dockerPull: quay.io/biocontainers/centrifuge:1.0.3--py27pl5.22.0_3

baseCommand: ["/home/haploMerger.sh"]
arguments: []
CWL From line 40 of Run/haplomerger.cwl

baseCommand: ["python","/home/Assemblosis/Run/hdf5check/hdf5Check.py"]
CWL From line 37 of Run/hdf5check.cwl

baseCommand: ["BuildDatabase"]
arguments:
- -name
- valueFrom: $(inputs.scaffolds.basename)
hints:
  - class: DockerRequirement
    dockerPull: quay.io/biocontainers/repeatmodeler:1.0.11--pl526_1

arguments:
- valueFrom: ${var r = []; for (var i = 0; i < inputs.bamPe.length; i++) { r.push("--frags"); r.push(inputs.bamPe[i].path); } return r; }
  position: 2
CWL From line 50 of Run/pilon.cwl
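
To see what the JavaScript expression above produces, it can be placed in a throwaway tool that simply echoes its arguments. The sketch below is not the Pilon tool definition: echo stands in for the real command, and only the bamPe input name and the body of the expression come from the snippet. For two input files the expression returns a four-element array (--frags, first path, --frags, second path), and because no itemSeparator is set, CWL adds each element to the command line as a separate argument.

```yaml
cwlVersion: v1.0
class: CommandLineTool

requirements:
  - class: InlineJavascriptRequirement   # needed for ${...} expressions

baseCommand: [echo]                       # stand-in command, for illustration only

inputs:
  bamPe:
    type: File[]                          # list of paired-end BAM files

arguments:
  - valueFrom: |
      ${
        // Emit one "--frags <path>" pair per input BAM file.
        var r = [];
        for (var i = 0; i < inputs.bamPe.length; i++) {
          r.push("--frags");
          r.push(inputs.bamPe[i].path);
        }
        return r;
      }
    position: 2

outputs: []
```

Building the list in an expression keeps the flag repeated once per file, which a single prefix on an array input with an itemSeparator would not do.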

baseCommand: ["RepeatMasker"]
arguments: ["-dir", $(runtime.outdir)]
hints:
  - class: DockerRequirement
    dockerPull: quay.io/biocontainers/repeatmasker:4.0.9_p2--pl526_2
    #dockerPull: quay.io/biocontainers/repeatmasker:4.0.7--pl526_13
    #dockerPull: quay.io/biocontainers/repeatmasker:4.0.6--pl5.22.0_10

arguments:
  - valueFrom: $(inputs.inputBamFile.basename).bai
    position: 2
CWL From line 16 of Run/samindex.cwl

arguments:
- valueFrom: pe1.$(inputs.reads1.nameroot).fastq.gz
  position: 5
- valueFrom: unpe1.$(inputs.reads1.nameroot).fastq.gz
  position: 6
- valueFrom: pe2.$(inputs.reads2.nameroot).fastq.gz
  position: 7
- valueFrom: unpe2.$(inputs.reads2.nameroot).fastq.gz
  position: 8
- valueFrom: trim.log
  prefix: -trimlog 
  position: 16
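
Because the output file names above are computed from the inputs, an outputs section can reuse the same parameter references to collect the files. The fragment below is a sketch under that assumption: the output identifiers are hypothetical, and only the two file name patterns for reads1 are taken from the arguments shown.

```yaml
# Hypothetical outputs section; identifiers are illustrative only.
outputs:
  pairedReads1:
    type: File
    outputBinding:
      # Same pattern as the argument at position 5.
      glob: pe1.$(inputs.reads1.nameroot).fastq.gz
  unpairedReads1:
    type: File
    outputBinding:
      # Same pattern as the argument at position 6.
      glob: unpe1.$(inputs.reads1.nameroot).fastq.gz
```

Note that nameroot strips only the final extension. For an input named sample_1.fastq.gz, nameroot is sample_1.fastq, so the first pattern expands to pe1.sample_1.fastq.fastq.gz.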

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/vetscience/Assemblosis.git
Name: assemblosis
Version: Version 1
Downloaded: 0
Copyright: Public Domain
License: None