Snakemake workflow for metagenomic analysis of rumen samples from sheep

Code Snippets

shell:
    'cat {input.read1} {input.read2} > {output.mergedReads}'
SnakeMake From line 58 of main/Snakefile
shell:
    'fastqc '
    '-o results/fastqc/ '
    '-q '
    '-t {threads} '
    '{input.fastq}'
SnakeMake From line 74 of main/Snakefile
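
The `shell:` directives in these snippets build each command from adjacent Python string literals, which Python joins into a single string. That is why every fragment ends with a space inside the quotes. The sketch below illustrates the mechanism with plain `str.format` standing in for Snakemake's own placeholder substitution; the thread count and the file name are hypothetical values, not taken from the workflow.

```python
from types import SimpleNamespace

# Adjacent string literals are concatenated by Python itself,
# so the trailing space inside each fragment separates the arguments.
command_template = (
    'fastqc '
    '-o results/fastqc/ '
    '-q '
    '-t {threads} '
    '{input.fastq}'
)

# Hypothetical values; in a real run Snakemake supplies them from the rule.
rule_input = SimpleNamespace(fastq='fastq/sampleA.fastq.gz')
command = command_template.format(threads=4, input=rule_input)
print(command)

# Forgetting a trailing space silently fuses two arguments together.
broken = ('fastqc ' '-q' '-t 4')
print(broken)
```

Printing the assembled string like this is a cheap way to check a long multi-fragment command before running the workflow.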
shell:
    'multiqc '
    '-n results/ReadsMultiQCReportRawData '
    '-s '
    '-f '
    '--interactive '
    '{input.fastqc}'
SnakeMake From line 90 of main/Snakefile
shell:
    'kneaddata '
    '--unpaired {input.fastq} '
    '-t {threads} '
    '--log-level INFO '
    '--log {log} '
    '--trimmomatic /home/kima/conda-envs/biobakery/share/trimmomatic-0.39-2 '
    '--sequencer-source TruSeq3 ' # to identify correct adapter sequences
    '-db /bifo/scratch//2022-AK-MBIE-Rumen-MG/ref/ARS_UI_Ramb_v2 '
    '-db /bifo/scratch//2022-AK-MBIE-Rumen-MG/ref/SILVA_128_LSUParc_SSUParc_ribosomal_RNA '
    '-o results/kneaddata && '
    'seqkit stats -j 12 -a results/kneaddata/{wildcards.samples}*.fastq > {output.readStats}'
SnakeMake From line 123 of main/Snakefile
shell:
    'fastqc '
    '-o results/fastqcKDR/ '
    '-q '
    '-t {threads} '
    '{input.fastqc}'
SnakeMake From line 147 of main/Snakefile
shell:
    'multiqc '
    '-n results/ReadsMultiQCReportKneadData '
    '-s '
    '-f '
    '--interactive '
    '{input.fastqc}'
SnakeMake From line 162 of main/Snakefile
shell:
    'kraken2 '
    '--use-names '
    '--db /dataset/2022-BJP-GTDB/scratch/2022-BJP-GTDB/kraken/GTDB '
    '-t {threads} '
    '--report {output.k2ReportGTDB} '
    '--report-minimizer-data '
    '{input.KDRs} > {output.k2OutputGTDB}'
SnakeMake From line 187 of main/Snakefile
shell:
    'bracken '
    '-d /dataset/2022-BJP-GTDB/scratch/2022-BJP-GTDB/kraken/GTDB '
    '-i {input.k2ReportGTDB} '
    '-o {output.bOutput} '
    '-w {output.bReport} '
    '-r 240 ' # average read length
    '-l S ' # SPECIES
    '-t 10 ' # remove low abundance species (noise)
    '&> {log} '
SnakeMake From line 209 of main/Snakefile
shell:
    'bracken '
    '-d /dataset/2022-BJP-GTDB/scratch/2022-BJP-GTDB/kraken/GTDB '
    '-i {input.k2ReportGTDB} '
    '-o {output.bOutput} '
    '-w {output.bReport} '
    '-r 240 ' # average read length
    '-l G ' # GENUS
    '-t 10 ' # remove low abundance genera (noise)
    '&> {log} '
SnakeMake From line 233 of main/Snakefile
shell:
    'combine_bracken_outputs.py '
    '--files /bifo/scratch/2022-AK-MBIE-Rumen-MG/Snakemake-Metagenomics/results/brackenSpecies/*.bracken '
    '-o results/countMatrices/bracken_species.report'
SnakeMake From line 254 of main/Snakefile
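
To make the result of this merge step concrete, here is a minimal sketch of combining per-sample Bracken tables into one taxon-by-sample count matrix. It is not the implementation of `combine_bracken_outputs.py`, and it assumes that each Bracken table is tab-delimited with `name` and `new_est_reads` columns. The helper names, sample names, taxa, and counts are all hypothetical.

```python
import csv
import io


def read_bracken_counts(handle):
    """Return {taxon name: estimated reads} from one Bracken table.

    Assumes tab-delimited input with 'name' and 'new_est_reads' columns;
    any other columns are ignored.
    """
    reader = csv.DictReader(handle, delimiter='\t')
    return {row['name']: int(row['new_est_reads']) for row in reader}


def build_count_matrix(tables):
    """Merge {sample: {taxon: count}} into a header and sorted rows.

    Taxa missing from a sample are filled with 0.
    """
    samples = sorted(tables)
    taxa = sorted({taxon for counts in tables.values() for taxon in counts})
    header = ['name'] + samples
    rows = [[taxon] + [tables[s].get(taxon, 0) for s in samples] for taxon in taxa]
    return header, rows


# Hypothetical two-sample input, reduced to the two columns used here.
sample_a = io.StringIO(
    'name\tnew_est_reads\nPrevotella sp1\t120\nMethanobrevibacter sp2\t15\n'
)
sample_b = io.StringIO('name\tnew_est_reads\nPrevotella sp1\t80\n')

tables = {
    'sampleA': read_bracken_counts(sample_a),
    'sampleB': read_bracken_counts(sample_b),
}
header, rows = build_count_matrix(tables)
print(header)
for row in rows:
    print(row)
```

Filling absent taxa with zero keeps every row the same width, which is what downstream count-matrix tools generally expect.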
shell:
    'combine_bracken_outputs.py '
    '--files /bifo/scratch/2022-AK-MBIE-Rumen-MG/Snakemake-Metagenomics/results/brackenGenus/*.bracken '
    '-o results/countMatrices/bracken_genus.report'
SnakeMake From line 268 of main/Snakefile
shell:
    'humann3 ' 
    '--memory-use minimum '
    '--threads {threads} '
    '--bypass-nucleotide-search '
    '--search-mode uniref50 '
    '--protein-database /bifo/scratch/2022-AK-MBIE-Rumen-MG/ref/humann3/unirefECFilt '
    '--input-format fastq '
    '--output results/humann3protein '
    '--input {input.KDRs} '
    '--output-basename {wildcards.samples} '
    '--o-log {log}'
SnakeMake From line 291 of main/Snakefile
shell:
    'humann_join_tables '
    '-i /bifo/scratch/2022-AK-MBIE-Rumen-MG/Snakemake-Metagenomics/results/humann3Uniref50EC '
    '--file_name genefamilies.tsv '
    '-o results/countMatrices/humann3_gene_families.tsv'
SnakeMake From line 312 of main/Snakefile
shell:
    'humann_join_tables '
    '-i /bifo/scratch/2022-AK-MBIE-Rumen-MG/Snakemake-Metagenomics/results/humann3Uniref50EC '
    '--file_name pathabundance.tsv '
    '-o results/countMatrices/humann3_path_abundance.tsv'
SnakeMake From line 326 of main/Snakefile
shell: 
    'humann_regroup_table '
    '-i {input} '
    '-c /bifo/scratch/2022-AK-MBIE-Rumen-MG/ref/humann3/utility_mapping/map_level4ec_uniref50.txt.gz '
    '-o {output}'
SnakeMake From line 340 of main/Snakefile
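
Conceptually, regrouping collapses UniRef50 gene-family abundances into broader categories (here level-4 EC numbers) through a mapping file. The sketch below illustrates that idea only and is not HUMAnN's code. It assumes each mapping entry lists one group followed by its member features, that member abundances are summed, and that unmapped features are collected under an `UNGROUPED` label. The function name and all identifiers and values are hypothetical.

```python
from collections import defaultdict


def regroup(abundances, group_members):
    """Sum feature abundances into groups.

    abundances: {feature id: abundance}
    group_members: {group id: [member feature ids]}
    Features without a group are collected under 'UNGROUPED' (an assumption).
    """
    # Invert the mapping so each feature points at the groups it belongs to.
    feature_to_groups = defaultdict(list)
    for group, members in group_members.items():
        for member in members:
            feature_to_groups[member].append(group)

    regrouped = defaultdict(float)
    for feature, value in abundances.items():
        for group in feature_to_groups.get(feature, ['UNGROUPED']):
            regrouped[group] += value
    return dict(regrouped)


# Hypothetical gene-family abundances and a one-group mapping.
abundances = {'UniRef50_A': 2.0, 'UniRef50_B': 3.0, 'UniRef50_C': 1.5}
group_members = {'1.1.1.1': ['UniRef50_A', 'UniRef50_B']}

result = regroup(abundances, group_members)
print(result)
```

A feature that belongs to several groups contributes its full abundance to each of them in this sketch; whether that matches the behaviour wanted for a given analysis is worth checking against the HUMAnN documentation.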
shell:
    'humann_rename_table '
    '-i {input} '
    '-n ec '
    '-o {output}'
SnakeMake From line 354 of main/Snakefile
shell: 
    'humann_regroup_table '
    '-i {input} '
    '-c /bifo/scratch/2022-AK-MBIE-Rumen-MG/ref/humann3/utility_mapping/map_ko_uniref50.txt.gz '
    '-o {output}'
SnakeMake From line 368 of main/Snakefile
shell:
    'humann_rename_table '
    '-i {input} '
    '-n kegg-orthology '
    '-o {output}'
SnakeMake From line 382 of main/Snakefile

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/alexmingikim/Snakemake-Metagenomics
Name: snakemake-metagenomics
Version: 1
Downloaded: 0
Copyright: Public Domain
License: None