Genetic Code Prediction and Annotation in Uncultivated Ciliates from Marine Environments


Genetic code prediction from karyorelict and heterotrich ciliates

Ciliates use an unusually wide variety of genetic codes compared with other groups of eukaryotes. The karyorelicts and heterotrichs have some of the most unusual codes, in which stop codons are potentially ambiguous: the same codon can either terminate translation or code for an amino acid.
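To make the idea of an ambiguous stop codon concrete, here is a minimal Python sketch. It assumes the codon assignments listed in NCBI translation table 27 (karyorelict nuclear code: TAA and TAG read as glutamine, TGA read as tryptophan or stop). The rule that TGA terminates only when it is the final codon is a deliberate simplification for illustration, and the function names are hypothetical; none of this is taken from the workflow itself.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AAS)}


def _codons(cds):
    """Split a coding sequence into complete codons (RNA input is accepted)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def translate_standard(cds):
    """Translate with the standard code, stopping at the first stop codon."""
    protein = []
    for codon in _codons(cds):
        aa = STANDARD.get(codon, "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_ambiguous(cds):
    """Translate with table-27-like assignments (illustrative assumption).

    TAA/TAG are always read as Q. TGA is read as W unless it is the last
    codon, where it is treated as the terminator (a simplification).
    """
    codons = _codons(cds)
    protein = []
    for idx, codon in enumerate(codons):
        if codon in ("TAA", "TAG"):
            protein.append("Q")
        elif codon == "TGA":
            if idx == len(codons) - 1:
                break
            protein.append("W")
        else:
            protein.append(STANDARD.get(codon, "X"))
    return "".join(protein)


if __name__ == "__main__":
    cds = "ATGTAATGAGGATGA"  # made-up sequence, not real data
    print(translate_standard(cds))
    print(translate_ambiguous(cds))
```

The same sequence is cut short at its first in-frame TAA under the standard code, but is read through to the terminal TGA under the ambiguous code.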

Aims:

  • Assemble single-cell transcriptome RNAseq data from uncultivated ciliates

  • Assemble whole-genome-amplification DNAseq data from uncultivated ciliates

  • Predict genetic codes from the above assemblies

  • Annotate genes in the above assemblies and identify clear examples of genetic codes with ambiguous stop codons
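One common way to approach the code-prediction aim is to tally which amino acid in a reference protein aligns opposite each canonical stop codon in the assembly. The Python sketch below illustrates that idea only. The input format (a list of codon and aligned-amino-acid pairs), the function names, and the thresholds `min_count` and `min_fraction` are all hypothetical; this is not the method or tool used by the workflow.

```python
from collections import Counter, defaultdict

STOP_CODONS = ("TAA", "TAG", "TGA")


def tally_stop_codon_alignments(pairs):
    """Count the amino acids aligned opposite each canonical stop codon.

    `pairs` is an iterable of (codon, aligned_amino_acid) tuples, assumed to
    have been extracted beforehand from translated alignments.
    """
    tally = defaultdict(Counter)
    for codon, aa in pairs:
        codon = codon.upper().replace("U", "T")
        if codon in STOP_CODONS:
            tally[codon][aa] += 1
    return tally


def call_reassignments(tally, min_count=3, min_fraction=0.6):
    """Return the best-supported amino acid per stop codon, or '?' if unclear.

    The thresholds are arbitrary illustrative values.
    """
    calls = {}
    for codon in STOP_CODONS:
        counts = tally.get(codon, Counter())
        total = sum(counts.values())
        if total < min_count:
            calls[codon] = "?"
            continue
        aa, n = counts.most_common(1)[0]
        calls[codon] = aa if n / total >= min_fraction else "?"
    return calls


if __name__ == "__main__":
    # Made-up observations, for illustration only.
    pairs = (
        [("TAA", "Q")] * 5
        + [("TAG", "Q")] * 4
        + [("TAG", "E")]
        + [("TGA", "W")] * 2
        + [("ATG", "M")]
    )
    print(call_reassignments(tally_stop_codon_alignments(pairs)))
```

A tally like this cannot distinguish a fully reassigned codon from an ambiguous one on its own, because genuine terminators sit at the ends of genes and are not covered by alignments; that is why the annotation step looks for clear gene-level examples.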

See our preprint (doi:10.1101/2022.04.12.488043) for more information.

Data

  • Single-cell transcriptome RNAseq from uncultivated marine ciliates, collected from Roscoff, France

  • Published datasets downloaded from SRA

Data deposition

Code Snippets

shell:
    "phyloFlash.pl -lib {wildcards.lib}_rnaseq_maponly_{wildcards.readlim} -readlength 150 -readlimit {wildcards.readlim} -skip_spades -read1 {input.fwd} -read2 {input.rev} -CPUs {threads} -html -treemap -log -poscov -zip -dbhome {params.db} 2> {log};"
    "mv {wildcards.lib}_rnaseq_maponly_{wildcards.readlim}.phyloFlash* qc/phyloFlash/;"
Snakemake, from line 80 of main/Snakefile
shell:
    "phyloFlash.pl -lib {wildcards.lib}_dnaseq -readlength 150 -read1 {input.fwd} -read2 {input.rev} -CPUs {threads} -almosteverything -dbhome {params.db} 2> {log};"
    "mv {wildcards.lib}_dnaseq.phyloFlash* qc/phyloFlash/;"
Snakemake, from line 112 of main/Snakefile
shell:
    "bbduk.sh -Xmx10g threads={threads} ref=resources/adapters.fa,resources/phix174_ill.ref.fa.gz in={input.fwd} in2={input.rev} ktrim=r qtrim=rl trimq=24 minlength=25 out={output.fwd} out2={output.rev} 2> {log}"
shell:
    "bbduk.sh -Xmx10g threads={threads} ref=resources/adapters.fa,resources/phix174_ill.ref.fa.gz in={input.fwd} in2={input.rev} ktrim=r qtrim=rl trimq=24 minlength=25 out={output.fwd} out2={output.rev} 2> {log}"
shell:
    "Trinity --seqType fq --max_memory 64G --bflyHeapSpaceMax 40G --CPU {threads} --full_cleanup --left {input.fwd} --right {input.rev} --output {params.prefix} &> {log};"
shell:
    "cat {input.assem} | parallel --gnu -j {threads} --recstart '>' -N 100 --pipe blastx -query - -db {params.db_prefix} -evalue 1e-20 -max_target_seqs 1 -outfmt 6 > {output.blastx};"
    "analyze_blastPlus_topHit_coverage.pl {output.blastx} {input.assem} {input.db} &> {log};"
Snakemake, from line 171 of main/Snakefile
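The blastx step above writes tabular output (`-outfmt 6`), whose default 12 columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. As a rough illustration of what a top-hit coverage summary involves, the Python sketch below computes, for each query, the percentage of the subject protein covered by its first reported hit. It is not a reimplementation of `analyze_blastPlus_topHit_coverage.pl`; the function name and the `subject_lengths` dictionary are hypothetical, and multiple HSPs for the same query are not merged.

```python
# Default column names of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def top_hit_coverage(lines, subject_lengths):
    """Percent of the subject covered by the first hit listed for each query.

    `lines` is an iterable of tab-separated outfmt 6 lines.
    `subject_lengths` maps subject IDs to protein lengths (assumed to be
    computed separately, e.g. from the database FASTA file).
    """
    coverage = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        rec = dict(zip(FIELDS, line.split("\t")))
        query = rec["qseqid"]
        if query in coverage:
            continue  # keep only the first line seen for each query
        sstart, send = int(rec["sstart"]), int(rec["send"])
        aligned = abs(send - sstart) + 1
        coverage[query] = 100.0 * aligned / subject_lengths[rec["sseqid"]]
    return coverage


if __name__ == "__main__":
    # Made-up hits, for illustration only.
    example = [
        "contig1\tprotA\t85.0\t100\t15\t0\t1\t300\t1\t100\t1e-50\t200\n",
        "contig1\tprotA\t70.0\t20\t6\t0\t400\t459\t150\t169\t1e-21\t90\n",
        "contig2\tprotB\t90.0\t50\t5\t0\t10\t159\t101\t150\t1e-25\t110\n",
    ]
    lengths = {"protA": 200, "protB": 50}
    for query, pct in top_hit_coverage(example, lengths).items():
        print(f"{query}\t{pct:.1f}")
```

A high proportion of contigs whose top hit covers most of the subject protein suggests that the assembly contains many near-full-length transcripts.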
shell:
    "bbduk.sh -Xmx10g threads={threads} ref=resources/adapters.fa,resources/phix174_ill.ref.fa.gz in={input} ktrim=r qtrim=rl trimq=24 minlength=25 out={output} 2> {log}"
shell:
    "Trinity --seqType fq --max_memory 64G --bflyHeapSpaceMax 40G --CPU {threads} --full_cleanup --single {input} --output {params.prefix} &> {log};"

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/Swart-lab/karyocode-workflow
Name: karyocode-workflow
Version: v1.0
Copyright: Public Domain
License: None