Integrated Workflow: Metagenomic Bins to Metabolic Models (GEMs)

public, 1yr ago, Version: Version 1

Workflow for metagenomics, from bins to genome-scale metabolic models (GEMs):

  • Prodigal: gene prediction
  • CarveMe: genome-scale metabolic model reconstruction
  • MEMOTE: metabolic model testing
  • SMETANA: Species METabolic interaction ANAlysis
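Read as a pipeline, the four tools listed above run in order for each bin, with SMETANA applied to the resulting set of models. The sketch below only builds the command lines to make that ordering concrete; it is not taken from the workflow's CWL files. The helper name `gem_commands`, the output file names and the use of `memote report snapshot` are assumptions for illustration, and nothing is executed.

```python
from pathlib import Path


def gem_commands(bin_fastas):
    """Build (but do not run) one command line per step.

    Hypothetical helper: file names and options are illustrative
    assumptions, not the arguments used by the workflow itself.
    """
    commands = []
    models = []
    for fasta in bin_fastas:
        root = Path(fasta).stem
        proteins = f"{root}.prodigal.faa"
        model = f"{root}.xml"
        # 1. Prodigal: predict genes and write protein translations (-a).
        commands.append(["prodigal", "-i", str(fasta), "-a", proteins])
        # 2. CarveMe: reconstruct a genome-scale model from the proteins.
        commands.append(["carve", proteins, "-o", model])
        # 3. MEMOTE: test the reconstructed model and write a report.
        commands.append(["memote", "report", "snapshot",
                         "--filename", f"{root}_memote.html", model])
        models.append(model)
    # 4. SMETANA: analyse interactions across all models of the community.
    commands.append(["smetana", *models])
    return commands


for command in gem_commands(["bin.1.fa", "bin.2.fa"]):
    print(" ".join(command))
```

Keeping each command as an argument list rather than a single string avoids shell quoting problems if the lists are later passed to `subprocess.run`.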

Code Snippets

baseCommand: [pigz, -c]

arguments:
  - valueFrom: $(inputs.inputfile)
CWL From line 16 of bash/pigz.cwl
baseCommand: [ bbduk.sh ]
baseCommand: [bbmap.sh]

arguments:
  - "-Xmx$(inputs.memory)M"
  - "printunmappedcount"
  - "overwrite=true"
  - "bloom=t"
  - "statsfile=$(inputs.identifier)_BBMap_stats.txt"
  - "covstats=$(inputs.identifier)_BBMap_covstats.txt"
  - |
    ${
      if (inputs.output_mapped){
        return 'outm1='+inputs.identifier+'_filtered_1.fq.gz \
                outm2='+inputs.identifier+'_filtered_2.fq.gz';
      } else {
        return 'outu1='+inputs.identifier+'_filtered_1.fq.gz \
                outu2='+inputs.identifier+'_filtered_2.fq.gz';
      }
    }
  # - "fast"
  # - "minratio=0.9"
  # - "maxindel=3"
  # - "bwr=0.16"
  # - "bw=12"
  # - "minhits=2"
  # - "qtrim=r"
  # - "trimq=10"
  # - "untrim"
  # - "idtag"
  # - "kfilter=25"
  # - "maxsites=1"
  # - "k=14"
  # - "nodisk=t"
  # - "out=$(inputs.identifier)_BBMap.sam"
  # - "rpkm=$(inputs.identifier).rpkm"
- entryname: script.sh
  entry: |-
    #!/bin/bash
    echo -e "\
    #!/usr/bin/python3\n\
    import sys\n\
    headers = set()\n\
    c = 0\n\
    for line in sys.stdin:\n\
      splitline = line.split()\n\
      if line[0] == '>':    \n\
        if splitline[0] in headers:\n\
          c += 1\n\
          print(splitline[0]+'.x'+str(c)+' '+' '.join(splitline[1:]))\n\
        else:\n\
          print(line.strip())\n\
        headers.add(splitline[0])\n\
      else:\n\
        print(line.strip())" > ./dup.py
    out_name=$1
    shift

    # Decompress first when the inputs are gzip-compressed.
    if file "$@" | grep gzip; then
      zcat "$@" | python3 ./dup.py | gzip > "$out_name"
    else
      cat "$@" | python3 ./dup.py | gzip > "$out_name"
    fi
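The Python program that the script above writes to `dup.py` is hard to read inside the `echo` string. A standalone sketch of the same renaming logic is shown here: a repeated FASTA header ID gets a `.x<N>` suffix, where `N` is a single counter shared by all IDs. The function name `dedup_headers` is an assumption, and unlike the embedded script this version does not leave a trailing space when a renamed header has no description.

```python
def dedup_headers(lines):
    """Yield FASTA lines, renaming repeated header IDs with a '.x<N>' suffix."""
    seen = set()
    counter = 0  # shared across all IDs, as in the embedded script
    for raw in lines:
        line = raw.strip()
        if not line.startswith(">"):
            # Sequence lines pass through unchanged (apart from stripping).
            yield line
            continue
        fields = line.split()
        seq_id = fields[0]
        if seq_id in seen:
            counter += 1
            yield " ".join([f"{seq_id}.x{counter}", *fields[1:]])
        else:
            yield line
        seen.add(seq_id)


example = [">a desc\n", "ACGT\n", ">a desc\n", "GG\n", ">b\n", ">a\n"]
for line in dedup_headers(example):
    print(line)
```

Because the counter is global rather than per ID, the suffixes only guarantee distinct names for repeats of the original IDs; they do not count how often each individual ID occurred.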
baseCommand: [ busco ]
baseCommand: [ carve ]
- entryname: script.sh
  entry: |-
    #!/bin/bash
    identifier=$1
    shift;
    echo "Model Mets Reactions Genes" > "${identifier}_CarveMe_GEMstats.tsv"
    for file in "$@"
    do
      bash /unlock/infrastructure/scripts/GEMstats.sh "$file"
    done >> "${identifier}_CarveMe_GEMstats.tsv"
CWL From line 20 of carveme/GEMstats.cwl
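The header row written above (`Model Mets Reactions Genes`) suggests one row per model with metabolite, reaction and gene counts. `GEMstats.sh` itself is not shown, so the following is only a guess at what such counts could look like for an SBML model of the kind CarveMe writes. The function name `gem_stats` and the choice of SBML elements (`species`, `reaction`, `geneProduct`) are assumptions.

```python
import xml.etree.ElementTree as ET


def gem_stats(sbml_text):
    """Count species, reactions and gene products in an SBML document.

    Hypothetical stand-in for the statistics script; the real script may
    count differently.
    """
    counts = {"species": 0, "reaction": 0, "geneProduct": 0}
    for element in ET.fromstring(sbml_text).iter():
        # Drop the XML namespace, e.g. '{...}reaction' -> 'reaction'.
        local = element.tag.rsplit("}", 1)[-1]
        if local in counts:
            counts[local] += 1
    return counts


example = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:fbc="http://www.sbml.org/sbml/level3/version1/fbc/version2">
  <model>
    <listOfSpecies><species id="a"/><species id="b"/></listOfSpecies>
    <listOfReactions><reaction id="r1"/></listOfReactions>
    <fbc:listOfGeneProducts><fbc:geneProduct fbc:id="g1"/></fbc:listOfGeneProducts>
  </model>
</sbml>"""
print(gem_stats(example))
```

Matching on the local tag name keeps the sketch independent of the exact SBML level and version namespaces.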
- entryname: script.sh
  entry: |-
    #!/bin/bash
    export CHECKM_DATA_PATH=/venv/checkm_data
    checkm lineage_wf "$@"
baseCommand: [ DAS_Tool ]
baseCommand: [ "Fasta_to_Contig2Bin.sh" ]
baseCommand: [ EukRep ]
baseCommand: [ fastp ]
baseCommand: [ fastqc ]
- entryname: script.sh
  entry: |-
    #!/bin/bash
    outname=$1
    longreads=$2
    shift;shift;
    filtlong "$longreads" "$@" 2> >(tee -a "$outname.filtlong.log" >&2) | gzip > "$outname.fastq.gz"
baseCommand: [ flye ]
CWL Flye From line 16 of flye/flye.cwl
- entryname: script.sh
  entry: |-
    #!/bin/bash
    export GTDBTK_DATA_PATH=$1
    shift;
    gtdbtk classify_wf "$@"
baseCommand: [ kraken2 ]
baseCommand: [ ktImportTaxonomy ]
baseCommand: [ "run_MaxBin.pl" ]
baseCommand: [ medaka.py ]
baseCommand: [ memote ]
baseCommand: [aggregateBinDepths.pl]
baseCommand: [ metabat2 ]

arguments:
  - prefix: "--outFile"
    valueFrom: MetaBAT2_bins/$(inputs.identifier)_MetaBAT2_bin
baseCommand: [jgi_summarize_bam_contig_depths]

arguments:
  - position: 1
    prefix: '--outputDepth'
    valueFrom:  $(inputs.identifier)_contigDepths.tsv
baseCommand: ["python3", "/scripts/metagenomics/assembly_bins_readstats.py"]
baseCommand: ["python3", "/scripts/metagenomics/bins_summary.py"]
- entryname: script.sh
  entry: |-
    #!/bin/bash
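    # Positional arguments:
    #   $1 = samtools flag option applied to flag 4: -F keeps mapped reads, -f keeps unmapped reads
    #   $2 = reference
    #   $3 = fastq
    #   $4 = minimap2 preset (e.g. map-ont)
    #   $5 = threads
    #   $6 = identifier

    minimap2 -a -t "$5" -x "$4" "$2" "$3" | samtools fastq -@ "$5" -n "$1" 4 | pigz -p "$5" > "${6}_filtered.fastq.gz"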
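The script above passes either `-f` or `-F` together with the value 4 to `samtools fastq`. Bit 0x4 of the SAM flag marks an unmapped read, so `-f 4` keeps only reads that have the bit set and `-F 4` keeps only reads that do not. A small sketch of that selection rule, with the hypothetical helper name `passes_filter`:

```python
UNMAPPED = 0x4  # SAM flag bit: segment unmapped


def passes_filter(flag, option):
    """Return True if a read with this SAM flag survives '<option> 4'."""
    if option == "-f":
        # -f 4: require the unmapped bit, i.e. keep unmapped reads.
        return flag & UNMAPPED == UNMAPPED
    if option == "-F":
        # -F 4: exclude reads with the unmapped bit, i.e. keep mapped reads.
        return flag & UNMAPPED == 0
    raise ValueError("option must be '-f' or '-F'")


for flag in (0, 4, 16):
    print(flag, passes_filter(flag, "-f"), passes_filter(flag, "-F"))
```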
baseCommand: [ NanoPlot ]
baseCommand: ["java"]

arguments:
  - "-jar"
  - "-Xmx$(inputs.memory)M"
  - "/venv/share/pilon-1.24-0/pilon.jar"
  - valueFrom: $(inputs.identifier)_pilon_polished
    prefix: "--output"
CWL From line 81 of pilon/pilon.cwl
baseCommand: [ prodigal ]

arguments:
  # - valueFrom: "sco" # What is the sco format?
  #   prefix: "-f"
  - valueFrom: $(inputs.input_fasta.nameroot).prodigal
    prefix: "-o"
  - valueFrom: $(inputs.input_fasta.nameroot).prodigal.ffn
    prefix: "-d"
  - valueFrom: $(inputs.input_fasta.nameroot).prodigal.faa
    prefix: "-a"
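The three `valueFrom` expressions above derive every output name from `nameroot`, which in CWL is the input's basename with its last extension removed. A minimal sketch of the resulting names follows; the function name `prodigal_outputs` and the example path are assumptions.

```python
from pathlib import PurePosixPath


def prodigal_outputs(input_fasta):
    """Map each Prodigal output option to the file name the snippet would build."""
    # PurePosixPath.stem drops the directory and the last extension,
    # which mirrors CWL's 'nameroot' for ordinary file names.
    nameroot = PurePosixPath(input_fasta).stem
    return {
        "-o": f"{nameroot}.prodigal",
        "-d": f"{nameroot}.prodigal.ffn",
        "-a": f"{nameroot}.prodigal.faa",
    }


print(prodigal_outputs("/data/bins/bin.1.fa"))
```

Only the last extension is removed, so a bin named `bin.1.fa` keeps the `.1` in all three output names.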
baseCommand: [ metaquast.py ]
CWL From line 7 of quast/metaquast.cwl
baseCommand: [ samtools, idxstats ]
baseCommand: [ samtools, index ]

arguments:
  - valueFrom: $(inputs.bam_file.basename).bai
    position: 2
- entryname: script.sh
  entry: |-
    #!/bin/bash
    # $1 = output prefix, $2 = threads, $3 = input alignment file
    samtools view -@ "$2" -hu "$3" | samtools sort -@ "$2" -o "$1.sorted.bam"
baseCommand: [ SemiBin, single_easy_bin ]
baseCommand: [ smetana ]
- entryname: input_spades.json
  entry: |-
    [
      {
        orientation: "fr",
        type: "paired-end",
        left reads: $( inputs.forward_reads.map( function(x) {return  x.path} ) ),
        right reads: $( inputs.reverse_reads.map( function(x) {return  x.path} ) )
      }
      ${
        var pacbio = "";
        if (inputs.pacbio_reads != null) {
          pacbio += ',{ type: "pacbio", single reads: ["' + inputs.pacbio_reads.map( function(x) {return  x.path} ).join('","') + '"] }';
        }
        return pacbio;
      }
      ${
        var nanopore = "";
        if (inputs.nanopore_reads != null) {
          nanopore += ',{ type: "nanopore", single reads: ["' + inputs.nanopore_reads.map( function(x) {return  x.path} ).join('","') + '"] }';
          //  nanopore+=',{ type: "nanopore", single reads: ["' + inputs.nanopore_reads.join('","') + '"] }'
        }
        return nanopore;
      }
    ]
CWL From line 15 of spades/spades.cwl
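The template above assembles the SPAdes dataset description by string concatenation: one paired-end library, plus optional PacBio and Nanopore libraries. The same structure can be sketched with plain data structures and serialized in one step. The function name `spades_dataset` and the example file names are assumptions; forward reads are placed under `left reads`, following SPAdes' documented convention.

```python
import json


def spades_dataset(forward, reverse, pacbio=None, nanopore=None):
    """Build the list of library descriptions for a SPAdes dataset file."""
    libraries = [{
        "orientation": "fr",
        "type": "paired-end",
        "left reads": list(forward),
        "right reads": list(reverse),
    }]
    # Long-read libraries are optional and only added when provided.
    if pacbio:
        libraries.append({"type": "pacbio", "single reads": list(pacbio)})
    if nanopore:
        libraries.append({"type": "nanopore", "single reads": list(nanopore)})
    return libraries


dataset = spades_dataset(["r_1.fq.gz"], ["r_2.fq.gz"], nanopore=["ont.fq.gz"])
print(json.dumps(dataset, indent=2))
```

Serializing with `json.dumps` quotes every key and path, so file names containing spaces or quotes cannot break the document the way hand-built strings can. JSON is also valid YAML, which is the format SPAdes documents for dataset files.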
baseCommand: [ spades.py, --dataset, input_spades.json ]

arguments:
  - valueFrom: $(runtime.outdir)/output
    prefix: -o
  - valueFrom: $(inputs.memory / 1000)
    prefix: --memory
CWL From line 51 of spades/spades.cwl
baseCommand: [ raw_n50 ]

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://gitlab.com/m-unlock/cwl/-/blob/master/cwl/workflows/workflow_metagenomics_GEM.cwl
Name: metagenomic-gems-from-assembly
Version: Version 1
Copyright: Public Domain
License: None