LongRead Quality Control and Filtering Workflow for Accurate Taxonomic Classification and Enhanced Data Quality


Workflow for LongRead Quality Control and Filtering

  • NanoPlot (read quality control) before and after filtering
  • Filtlong (read quality filtering and trimming)
  • Kraken2 taxonomic read classification before and after filtering
  • Minimap2 read filtering based on given references

Code Snippets

baseCommand: [pigz, -c]

arguments:
  - valueFrom: $(inputs.inputfile)
CWL From line 16 of bash/pigz.cwl
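- entryname: script.sh
  entry: |-
    #!/bin/bash
    # Write a helper script that renames duplicate sequence headers
    # (lines starting with '>') by appending .x<count> to the name.
    echo -e "\
    #!/usr/bin/python3\n\
    import sys\n\
    headers = set()\n\
    c = 0\n\
    for line in sys.stdin:\n\
      splitline = line.split()\n\
      if line.startswith('>'):\n\
        if splitline[0] in headers:\n\
          c += 1\n\
          print(splitline[0]+'.x'+str(c)+' '+' '.join(splitline[1:]))\n\
        else:\n\
          print(line.strip())\n\
        headers.add(splitline[0])\n\
      else:\n\
        print(line.strip())" > ./dup.py
    # $1 = output file name; remaining arguments = input files
    out_name=$1
    shift

    # Decompress gzip input first; plain-text input is passed through as-is.
    if file "$@" | grep -q gzip; then
      zcat "$@" | python3 ./dup.py | gzip > "$out_name"
    else
      cat "$@" | python3 ./dup.py | gzip > "$out_name"
    fi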
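
The header-renaming logic embedded in the script above can be easier to follow as a standalone function. The following is a minimal sketch, not the workflow's own code: the function name `rename_duplicate_headers` and the sample records are hypothetical. It mirrors the embedded script, including the single running counter shared by all duplicates, except that it drops the trailing space the original prints when a duplicate header has no description.

```python
from typing import Iterable, Iterator


def rename_duplicate_headers(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines with repeated '>' header names made unique via a .x<count> suffix."""
    seen = set()      # header names encountered so far
    duplicates = 0    # one running counter shared by all duplicates, as in the script
    for raw in lines:
        if raw.startswith(">"):
            fields = raw.split()
            name = fields[0]
            if name in seen:
                duplicates += 1
                description = " ".join(fields[1:])
                # rstrip() removes the trailing space when there is no description
                # (an intentional difference from the embedded script)
                yield f"{name}.x{duplicates} {description}".rstrip()
            else:
                yield raw.strip()
            seen.add(name)
        else:
            yield raw.strip()


if __name__ == "__main__":
    # Hypothetical records for illustration only
    records = [">chr1 first copy", "ACGT", ">chr1 second copy", "TTGA", ">chr2", "GGCC"]
    for out in rename_duplicate_headers(records):
        print(out)
```

Because the counter is global rather than per header, the suffix numbers indicate the order in which duplicates were found across the whole input, not how many times a particular name has been repeated.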
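- entryname: script.sh
  entry: |-
    #!/bin/bash
    # $1 = output prefix, $2 = long-read file, remaining arguments = Filtlong options
    outname=$1
    longreads=$2
    shift; shift
    # Filtered reads are gzipped to <prefix>.fastq.gz; Filtlong's stderr is
    # appended to <prefix>.filtlong.log and still echoed to stderr.
    filtlong "$longreads" "$@" 2> >(tee -a "$outname.filtlong.log" >&2) | gzip > "$outname.fastq.gz"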
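
The redirection in the last line of this script does two things at once: it compresses the filter's standard output and copies its standard error into a log file while still showing it. As a rough illustration of that pattern only, here is a Python sketch; the function name `run_and_compress` is hypothetical, and the command is passed in by the caller, so nothing here assumes particular Filtlong options.

```python
import gzip
import shutil
import subprocess
import sys
import threading
from typing import List


def run_and_compress(cmd: List[str], out_prefix: str) -> int:
    """Run cmd, gzip its stdout to <prefix>.fastq.gz, and tee its stderr to a log."""
    log_path = f"{out_prefix}.filtlong.log"
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc:

        def tee_stderr() -> None:
            # Append to the log (like `tee -a`) and echo to our own stderr
            with open(log_path, "ab") as log:
                for chunk in proc.stderr:
                    log.write(chunk)
                    sys.stderr.write(chunk.decode(errors="replace"))

        # Drain stderr in a separate thread so neither pipe can fill up and block
        worker = threading.Thread(target=tee_stderr)
        worker.start()
        with gzip.open(f"{out_prefix}.fastq.gz", "wb") as compressed:
            shutil.copyfileobj(proc.stdout, compressed)
        worker.join()
    # Leaving the `with` block waits for the process, so returncode is set here
    return proc.returncode
```

A call might look like `run_and_compress(["filtlong", "reads.fastq.gz"], "sample")`, where the file name and prefix are placeholders. Unlike the bash pipeline, this sketch returns the exit code of the filtering command rather than that of the final `gzip`.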
baseCommand: [ kraken2 ]
baseCommand: [ ktImportTaxonomy ]
- entryname: script.sh
  entry: |-
    #!/bin/bash
    # $1 = samtools flag filter: -F keeps mapped reads, -f keeps unmapped reads
    # $2 = reference
    # $3 = fastq
    # $4 = minimap2 preset (e.g. map-ont)
    # $5 = threads
    # $6 = output identifier

    minimap2 -a -t "$5" -x "$4" "$2" "$3" | samtools fastq -@ "$5" -n "$1" 4 | pigz -p "$5" > "${6}_filtered.fastq.gz"
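
The first argument of this script decides whether reads that map to the reference are kept or discarded, using SAM flag bit 4 (read unmapped). With `samtools`, `-f 4` keeps only records that have the bit set, and `-F 4` keeps only records that do not. The sketch below restates that rule in Python; the function name `keep_read` is hypothetical and the code only illustrates the bit test, it is not a replacement for `samtools`.

```python
UNMAPPED = 0x4  # SAM flag bit: the read itself is unmapped


def keep_read(flag: int, mode: str, mask: int = UNMAPPED) -> bool:
    """Return True if a record with this SAM flag passes the given filter mode."""
    if mode == "-f":
        # require: every bit in the mask must be set (keeps unmapped reads for mask 4)
        return flag & mask == mask
    if mode == "-F":
        # exclude: no bit in the mask may be set (keeps mapped reads for mask 4)
        return flag & mask == 0
    raise ValueError(f"unknown filter mode: {mode!r}")


if __name__ == "__main__":
    for flag in (0, 4, 16):
        print(flag, keep_read(flag, "-f"), keep_read(flag, "-F"))
```

In practice, `-f` is the mode to choose when the reference is a contaminant to remove (keep what does not map), and `-F` when only reads matching the reference are wanted.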
baseCommand: [ NanoPlot ]

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://gitlab.com/m-unlock/cwl/-/blob/master/cwl/workflows/workflow_nanopore_quality.cwl
Name: longread-quality-control-and-filtering
Version: Version 1
Copyright: Public Domain
License: None