Workflow Steps and Code Snippets

176 tagged steps and code snippets that match keyword seqtk

Snakemake-based workflow for the assembly of chloroplast genomes

shell:
    """
    if [ -s {input.genome} ]
    then
        mkdir -p {params.dir}
        for contig in `grep '^>' {input.genome} | sed -e 's/>//g'`
        do
            echo $contig > {params.dir}/tmp
            seqtk subseq {input.genome} {params.dir}/tmp > {params.dir}/$contig.fasta

            nucmer --maxmatch {input.index} {params.dir}/$contig.fasta -p {params.dir}/out
            show-coords -THrd {params.dir}/out.delta > {params.dir}/out.coords
            start=`sort -k6,6hr {params.dir}/out.coords | head -n 1| cut -f3`
            echo ">$contig" >> {output}
            echo "$start XXX"
            if [ "$start" == 1 ]
            then
                grep -v '^>' {params.dir}/$contig.fasta | tr -d '\n' >> {output}
                echo "" >> {output} 
            elif [ -n "$start" ]
            then
                grep -v '^>' {params.dir}/$contig.fasta | tr -d '\n' > {params.dir}/temp.fasta
                cut -c ${{start}}- {params.dir}/temp.fasta > {params.dir}/start.fasta
                cut -c -$((start-1)) {params.dir}/temp.fasta > {params.dir}/end.fasta
                cat {params.dir}/start.fasta {params.dir}/end.fasta | tr -d '\n' >> {output}
                echo "" >> {output}
            else
                grep -v '^>' {params.dir}/$contig.fasta | tr -d '\n' >> {output}
                echo "" >> {output}
            fi
            rm -rf {params.dir}/*
        done
        rm -rf {params.dir}
    else
        touch {output}
    fi
    """

SnakeMake seqtk subSeq mummer2circos From line 82 of main/Snakefile
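
The `elif` branch of the rule above rotates each contig so that it begins at the 1-based coordinate held in `start`: `cut -c start-` keeps the tail, `cut -c -(start-1)` keeps the head, and the two pieces are concatenated tail first. The `start == 1` case is handled separately because no rotation is needed there. Below is a minimal Python sketch of that rotation, assuming the sequence has already been joined into one unwrapped string. `rotate_to_start` is a hypothetical name for illustration and is not part of the workflow.

```python
def rotate_to_start(sequence: str, start: int) -> str:
    """Rotate a circular sequence so that 1-based position `start` comes first."""
    if not 1 <= start <= len(sequence):
        raise ValueError("start must lie within the sequence (1-based)")
    offset = start - 1  # convert the 1-based coordinate to a 0-based index
    # tail (start..end) followed by head (1..start-1), as in the shell rule
    return sequence[offset:] + sequence[:offset]


print(rotate_to_start("ACGTTGCA", 4))  # TTGCAACG
```

With `start=1` the function returns the sequence unchanged, which matches the first branch of the shell code.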

shell:
    """
    echo {params.random_seed}
    seqtk sample -s {params.random_seed} {input} {config[number_reads]} > {output}
    """

SnakeMake seqtk From line 332 of main/Snakefile

shell:
    """
    samtools view {input.bam} | cut -f1 | sort | uniq > {output.list}
    seqtk subseq {input.fastFile} {output.list} \
        | bioawk -c fastx \
            'length($seq) > {config[read_min_length]} && length($seq) < {config[chloroplast_size]} \
            {{print \">\"$name\"\\n\"$seq}}' > {output.fastFile}
    """

SnakeMake SAMtools seqtk subSeq bioawk From line 355 of main/Snakefile
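
The rule above does two things: it keeps only reads whose names appear in the BAM file, and then keeps only those whose length lies strictly between `read_min_length` and `chloroplast_size`. A rough Python sketch of the same selection logic follows, assuming the reads are already available as `(name, sequence)` pairs. `select_reads` and its parameter names are hypothetical and do not come from the workflow.

```python
from typing import Iterable, Iterator, Set, Tuple


def select_reads(
    records: Iterable[Tuple[str, str]],
    wanted_names: Set[str],
    min_length: int,
    max_length: int,
) -> Iterator[Tuple[str, str]]:
    """Yield reads that are in `wanted_names` and strictly within the length bounds."""
    for name, sequence in records:
        # Strict inequalities mirror the `>` and `<` used in the bioawk expression.
        if name in wanted_names and min_length < len(sequence) < max_length:
            yield name, sequence


reads = [("r1", "ACGTACGT"), ("r2", "AC"), ("r3", "ACGTAC"), ("r4", "ACGTACGTACGT")]
for name, sequence in select_reads(reads, {"r1", "r2", "r4"}, 3, 10):
    print(f">{name}\n{sequence}")  # FASTA output, as in the bioawk print statement
```

Only `r1` is printed here: `r2` is too short, `r3` is not in the name list, and `r4` is too long.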

shell:
    """
    awk 'NR == 1 {{print substr($1,2,length($1)), \"0\", \"10000\"}}' {input} > chloro_assembly/reference/index.bed
    seqtk subseq {input} chloro_assembly/reference/index.bed > {output}
    rm chloro_assembly/reference/index.bed
    """

SnakeMake seqtk subSeq From line 405 of main/Snakefile
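
The rule above writes a one-line BED interval (`name 0 10000`) for the first sequence in the input and passes it to `seqtk subseq`. BED coordinates are 0-based and half-open, so this interval covers the first 10,000 bases. The sketch below reproduces that selection in plain Python on FASTA text held in memory. `first_record_prefix` is a hypothetical helper, and the sketch makes no attempt to reproduce the header that seqtk writes.

```python
from typing import Tuple


def first_record_prefix(fasta_text: str, end: int = 10000) -> Tuple[str, str]:
    """Return (name, first `end` bases) of the first record in FASTA text."""
    name = None
    chunks = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if name is not None:
                break  # a second header ends the first record
            # the name is the first whitespace-delimited token without '>'
            name = line[1:].split()[0]
        elif name is not None:
            chunks.append(line.strip())
    if name is None:
        raise ValueError("no FASTA header found")
    # BED interval [0, end) corresponds to the Python slice [0:end]
    return name, "".join(chunks)[:end]


example = ">chr1 some description\nACGT\nACGT\n>chr2\nTTTT\n"
print(first_record_prefix(example, end=6))  # ('chr1', 'ACGTAC')
```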

A repository to conduct experiments with omnitig-related models for genome assembly. (v0.4.3)

shell:  "${{CONDA_PREFIX}}/bin/time -v seqtk seq -AU '{input.reads}' > '{output.reads}'"

SnakeMake seqtk From line 2177 of master/Snakefile

A pipeline for lightweight screening of Eukaryotic genomes and transcriptomes for recent HGT (v1.0.0)

shell:'''
seqtk subseq {input.fa} {input.gene_lst} > {output}
'''

SnakeMake seqtk subSeq From line 262 of main/Snakefile

Modular Shotgun Sequence Analysis Workflow: Oecophylla - Harnessing Snakemake

run:
    aln2ext = {'utree': 'tsv', 'burst': 'b6', 'bowtie2': 'sam'}
    ext = aln2ext[params['aligner']]
    with tempfile.TemporaryDirectory(dir=find_local_scratch(TMP_DIR_ROOT)) as temp_dir:
        shell("""
              set +u; {params.env}; set -u

              # get stem file path
              stem={output.profile}
              stem=${{stem%.profile.txt}}

              # interleave paired fastq's and convert to fasta
              seqtk mergepe {input.forward} {input.reverse} | \
              seqtk seq -A > {temp_dir}/{wildcards.sample}.fna

              # map reads to reference database
              shogun align \
              --aligner {params.aligner} \
              --threads {threads} \
              --database {params.db} \
              --input {temp_dir}/{wildcards.sample}.fna \
              --output {temp_dir} \
              2> {log} 1>&2

              # build taxonomic profile based on read map
              shogun assign_taxonomy \
              --aligner {params.aligner} \
              --database {params.db} \
              --input {temp_dir}/alignment.{params.aligner}.{ext} \
              --output {output.profile} \
              2> {log} 1>&2

              # keep mapping file
              if [[ "{params.map}" == "True" ]]
              then
                gzip -c {temp_dir}/alignment.{params.aligner}.{ext} > $stem.{params.aligner}.{ext}.gz
              fi

              # redistribute reads to given taxonomic ranks
              if [[ -n "{params.levels}" ]]
              then
                IFS=',' read -r -a levels <<< "{params.levels}"
                for level in "${{levels[@]}}"
                do
                  shogun redistribute \
                  --database {params.db} \
                  --level $level \
                  --input {output.profile} \
                  --output $stem.redist.$level.txt \
                  2> {log} 1>&2
                done
              fi
              """)

SnakeMake Bowtie 2 seqtk From line 334 of taxonomy/taxonomy.rule
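
The first step of the rule above interleaves the forward and reverse FASTQ files with `seqtk mergepe` and converts the result to FASTA with `seqtk seq -A`. A minimal Python sketch of that interleave-and-convert step is shown below. It assumes strict four-line FASTQ records and equally long inputs, and the function names are hypothetical rather than taken from the workflow or from seqtk.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group lines into four-line FASTQ records (header, sequence, '+', quality)."""
    iterator = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(iterator, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave_as_fasta(
    forward_lines: Iterable[str], reverse_lines: Iterable[str]
) -> Iterator[str]:
    """Yield FASTA lines with each forward read followed by its mate."""
    # zip stops at the shorter input, so unequal files are silently truncated here.
    for fwd, rev in zip(fastq_records(forward_lines), fastq_records(reverse_lines)):
        for header, sequence, _plus, _quality in (fwd, rev):
            yield ">" + header[1:]  # replace the leading '@' with '>'
            yield sequence  # quality line is dropped in FASTA


forward = ["@r1/1\n", "ACGT\n", "+\n", "IIII\n"]
reverse = ["@r1/2\n", "TTGA\n", "+\n", "IIII\n"]
print("\n".join(interleave_as_fasta(forward, reverse)))
```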

HIV Drug Resistance Profiling Pipeline using Bowtie2, Lofreq, and SierraPy

shell:
    """
    seqtk trimfq {input.reads1} > {output.trim1}
    seqtk trimfq {input.reads2} > {output.trim2}
    """

SnakeMake seqtk From line 12 of rules/filter_reads.smk

shell:
    """
    seqtk sample -s {params.seed} {input.trim1} {params.n} > {output.sub1}
    seqtk sample -s {params.seed} {input.trim2} {params.n} > {output.sub2}
    """

SnakeMake seqtk From line 32 of rules/filter_reads.smk
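
Both `seqtk sample` calls above use the same seed so that the subsampled files stay paired: with identical seeds and identically ordered inputs, the same record positions are chosen from each file. The sketch below illustrates the idea with a generic reservoir sampler in Python. It is not seqtk's implementation or random number generator, and `reservoir_sample` is a hypothetical name.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int, seed: int) -> List[T]:
    """Pick k records uniformly at random in one pass using a seeded generator."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for index, record in enumerate(records):
        if index < k:
            reservoir.append(record)  # fill the reservoir first
        else:
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < k:
                reservoir[slot] = record  # replace with probability k / (index + 1)
    return reservoir


forward = [f"read{i}/1" for i in range(100)]
reverse = [f"read{i}/2" for i in range(100)]
sub1 = reservoir_sample(forward, 10, seed=42)
sub2 = reservoir_sample(reverse, 10, seed=42)

# The random choices depend only on the seed and the record positions,
# so both subsamples contain the same read pairs in the same order.
print([name[:-2] for name in sub1] == [name[:-2] for name in sub2])  # True
```

Using different seeds for the two files would select unrelated reads and break the pairing.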

Repository for the Microbiology Resource Announcements paper on five complete Streptococcus suis genomes

seqtk comp $1 | awk '{gc += ($4 + $5)} {at += ($3 + $6)} END {print gc/(gc + at)}'

Shell seqtk From line 2 of scripts/get_gc.sh
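
The one-liner above relies on the column layout of `seqtk comp`, where columns 3 to 6 hold the counts of A, C, G and T for each sequence. It sums C+G and A+T over all sequences and reports GC/(GC+AT), so ambiguous bases such as N are excluded from the denominator. A small Python sketch of the same arithmetic follows, assuming whitespace-separated `seqtk comp`-style lines as input. `gc_fraction` is a hypothetical name.

```python
from typing import Iterable


def gc_fraction(comp_lines: Iterable[str]) -> float:
    """Compute the overall GC fraction from seqtk comp-style lines."""
    gc = 0
    at = 0
    for line in comp_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # columns: name, length, #A, #C, #G, #T, ...
        a, c, g, t = (int(value) for value in fields[2:6])
        gc += c + g
        at += a + t
    if gc + at == 0:
        raise ValueError("no A, C, G or T bases counted")
    return gc / (gc + at)


lines = ["seq1\t10\t2\t3\t3\t2", "seq2\t10\t4\t1\t1\t4"]
print(gc_fraction(lines))  # 8 GC bases out of 20 counted bases
```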

tool / biotools

seqtk

A tool for processing sequences in FASTA or FASTQ format. It parses both formats, and input files may optionally be gzip-compressed.

References:

seqtk

operation_2409

operation_2121

lh3/seqtk

10.1186/s13104-017-2616-7

Metadata:

topic: Data management

toolType: Command-line tool

operation: Sequence file editing

signature: seqtk
