16S rRNA-Seq Analysis Workflow for Kelly Brendan's Lab Data


This workflow uses the standard Snakemake workflow template. Code lives in the scripts, rules, and envs folders. The entry point of the workflow is defined in the Snakefile, and the main configuration is in the config.yaml file.

The data are 16S (V1-V2) rRNA-seq reads from Kelly Brendan's lab, published on SRA under BioProject ID PRJNA682076. Explore both ASV and OTU analyses.

Usage

Run on the new respublica cluster with:

    snakemake --latency-wait 10 -j 10 -p -c "sbatch --job-name={params.jobName} --mem={params.mem} -c {threads} --time=360 -e sbatch/{params.jobName}.e -o sbatch/{params.jobName}.o"
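The cluster command above is a template that is filled in per job from each rule's params and threads. The sketch below imitates that substitution with plain Python string formatting, to show what a submitted command could look like. The rule values (job name, memory, thread count) are hypothetical and not taken from the workflow.

```python
from types import SimpleNamespace

# Cluster submission template copied from the usage command above.
CLUSTER_TEMPLATE = (
    "sbatch --job-name={params.jobName} --mem={params.mem} -c {threads} "
    "--time=360 -e sbatch/{params.jobName}.e -o sbatch/{params.jobName}.o"
)


def render_cluster_command(job_name: str, mem: str, threads: int) -> str:
    """Fill the template from a rule's params and threads (illustration only)."""
    params = SimpleNamespace(jobName=job_name, mem=mem)
    return CLUSTER_TEMPLATE.format(params=params, threads=threads)


# Hypothetical rule values, for illustration only.
command = render_cluster_command("dada2_denoise", "16G", 8)
print(command)
```

Because the template references params.jobName and params.mem, every rule submitted this way presumably needs to define both. The sbatch/ directory for the -e and -o log files likely has to exist before submission.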

Code Snippets

shell:
    '''
    conda activate qiime2-2020.11
    qiime dada2 denoise-paired \
        --i-demultiplexed-seqs {input.q2_import} \
        --p-trunc-len-f {config[truncation_len-f]} \
        --p-trunc-len-r {config[truncation_len-r]} \
        --p-n-reads-learn {config[training]} \
        --p-n-threads {threads} \
        --p-chimera-method {config[chimera]} \
        --o-table {output.table} \
        --o-representative-sequences {output.seq} \
        --o-denoising-stats {output.stats} --verbose &> {log}
    conda deactivate
    '''
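The denoising command above reads four values from config.yaml through {config[...]} placeholders. The sketch below checks that those keys are present before a run and shows how two of the placeholders expand. The key names come from the command itself; the example values are hypothetical and should be chosen from the read-quality plots for the actual data.

```python
# Keys referenced as {config[...]} in the denoising shell command above.
REQUIRED_KEYS = ["truncation_len-f", "truncation_len-r", "training", "chimera"]


def missing_config_keys(config: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent from the config mapping."""
    return [key for key in required if key not in config]


# Hypothetical values, not the ones used for this dataset.
example_config = {
    "truncation_len-f": 240,
    "truncation_len-r": 200,
    "training": 1000000,
    "chimera": "consensus",
}

missing = missing_config_keys(example_config)

# The same placeholder syntax as in the shell block, expanded with str.format.
flags = (
    "--p-trunc-len-f {config[truncation_len-f]} "
    "--p-trunc-len-r {config[truncation_len-r]}"
).format(config=example_config)
print(missing, flags)
```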
shell:
    '''
    conda activate qiime2-2020.11
    qiime metadata tabulate \
        --m-input-file {input.stats} \
        --o-visualization {output.stats_viz} --verbose &> {log}

    qiime feature-table summarize \
        --i-table {input.table} \
        --o-visualization {output.table_viz} \
        --m-sample-metadata-file {input.stats} --verbose &>> {log}

    qiime feature-table tabulate-seqs \
        --i-data {input.seq} \
        --o-visualization {output.seq_viz} --verbose &>> {log}

    conda deactivate   
    '''
shell:
    '''
    ## --p-sampling-depth should be chosen carefully by reviewing the table summary visualization (asv-table.qzv)
    conda activate qiime2-2019.7

    # echo "this is to test conda..." &> {log}
    qiime diversity core-metrics-phylogenetic \\
        --i-phylogeny {input.rooted_tree} \\
        --i-table {input.feature_table} \\
        --p-sampling-depth {config[sampling_depth]} \\
        --p-n-jobs {threads} \\
        --m-metadata-file {input.metadata} \\
        --o-rarefied-table {output.rarefied_table} \\
        --o-faith-pd-vector {output.faith_pd_vector} \\
        --o-observed-otus-vector {output.observed_otus_vector} \\
        --o-shannon-vector {output.shannon_vector} \\
        --o-evenness-vector {output.evenness_vector} \\
        --o-unweighted-unifrac-distance-matrix {output.unweighted_unifrac_distance_matrix} \\
        --o-weighted-unifrac-distance-matrix {output.weighted_unifrac_distance_matrix} \\
        --o-jaccard-distance-matrix {output.jaccard_distance_matrix} \\
        --o-bray-curtis-distance-matrix {output.bray_curtis_distance_matrix} \\
        --o-unweighted-unifrac-pcoa-results {output.unweighted_unifrac_pcoa_results} \\
        --o-weighted-unifrac-pcoa-results {output.weighted_unifrac_pcoa_results} \\
        --o-jaccard-pcoa-results {output.jaccard_pcoa_results} \\
        --o-bray-curtis-pcoa-results {output.bray_curtis_pcoa_results} \\
        --o-unweighted-unifrac-emperor {output.unweighted_unifrac_emperor} \\
        --o-weighted-unifrac-emperor {output.weighted_unifrac_emperor} \\
        --o-jaccard-emperor {output.jaccard_emperor} \\
        --o-bray-curtis-emperor {output.bray_curtis_emperor} &> {log}

    conda deactivate   
    '''
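The comment in the block above says the sampling depth should be chosen by reviewing the feature-table summary. One way to reason about that choice is to tabulate, for a few candidate depths, how many samples would survive rarefaction: samples with fewer total reads than the depth are dropped. The sketch below assumes the per-sample totals have been copied out of the summary; the counts shown are hypothetical.

```python
def samples_retained(sample_counts: dict, depth: int) -> list:
    """Return the sample IDs whose total feature count is at least `depth`."""
    return sorted(s for s, n in sample_counts.items() if n >= depth)


def retention_table(sample_counts: dict, candidate_depths) -> list:
    """For each depth, list (depth, samples kept, total reads kept)."""
    rows = []
    for depth in sorted(candidate_depths):
        kept = samples_retained(sample_counts, depth)
        # After rarefaction every retained sample contributes exactly `depth` reads.
        rows.append((depth, len(kept), depth * len(kept)))
    return rows


# Hypothetical per-sample totals, for illustration only.
counts = {"S1": 12000, "S2": 8500, "S3": 900, "S4": 15000}
table = retention_table(counts, [1000, 8000, 10000])
for depth, n_samples, n_reads in table:
    print(depth, n_samples, n_reads)
```

A higher depth keeps more reads per sample but drops more samples, so the table makes that trade-off explicit before a value is written to sampling_depth in the config.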
shell:
    '''
    conda activate qiime2-2019.7
    qiime diversity alpha-group-significance \\
        --i-alpha-diversity {input.faith_pd_vector} \\
        --m-metadata-file {input.metadata} \\
        --o-visualization {output.faith_pd_group_significance} &> {log}


    qiime diversity alpha-group-significance \\
        --i-alpha-diversity {input.evenness_vector} \\
        --m-metadata-file {input.metadata} \\
        --o-visualization {output.evenness_group_significance} &>> {log}  

    conda deactivate
    '''
shell:
    '''
    conda activate qiime2-2019.7
    qiime diversity beta-group-significance \\
        --i-distance-matrix {input.unweighted_unifrac_distance_matrix} \\
        --m-metadata-file {input.metadata} \\
        --m-metadata-column recurrence_within_180_days \\
        --o-visualization {output.unweighted_unifrac_recurrence_significance} \\
        --p-pairwise &> {log} 

    conda deactivate
    '''
shell:
    '''
    grabseqs sra {params.project} -m {output.metadata} -o {output.outdir} -r 3

    '''
shell:
    '''
    conda activate qiime2-2019.7
    qiime phylogeny align-to-tree-mafft-fasttree \
        --p-n-threads {threads} \
        --i-sequences {input.rep} \
        --o-alignment {output.alignment} \
        --o-masked-alignment {output.masked_alignment} \
        --o-tree {output.unrooted_tree} \
        --o-rooted-tree {output.rooted_tree} \
        --verbose &> {log}

    conda deactivate   
    '''
shell:
  "fastqc --quiet -t {threads} --outdir ../results/fastqc {input} &> {log}"
shell:
    '''  
      trimmomatic PE -threads {threads} -phred33 -quiet {input.r1} {input.r2} \
      {output.r1} {output.r1_unpaired} {output.r2} {output.r2_unpaired} {params.trimmer}
    '''
shell:
  "fastqc --quiet -t {threads} --outdir ../results/fastqc_trim {input} &> {log}"
shell: 
  """
  multiqc --force --quiet --filename multiqc.html --outdir ../results/raw_multi_fastqc {input.raw_qc} &> {log} #run multiqc
  # repeat for trimmed data
  multiqc --force --quiet --filename multiqc.html --outdir ../results/trim_multi_fastqc {input.trim_qc} &>> {log} #run multiqc
  """ 
shell:
    '''
    # Imports demultiplexed paired end FASTQ files
    # Needed to create a unique manifest file to map file paths to sample ids
    qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path {input} \
    --input-format PairedEndFastqManifestPhred33V2 \
    --output-path {output.q2_import} &> {log}
    '''
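The import step above needs a manifest that maps each sample ID to its forward and reverse FASTQ paths. The PairedEndFastqManifestPhred33V2 format is a tab-separated file with the header columns sample-id, forward-absolute-filepath, and reverse-absolute-filepath. The sketch below builds such a file, assuming reads are named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`. That naming pattern, the function name, and the demo accession are assumptions, not taken from the workflow.

```python
import csv
import tempfile
from pathlib import Path

HEADER = ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]


def write_manifest(fastq_dir, manifest_path) -> list:
    """Write a paired-end manifest for every complete read pair in fastq_dir."""
    fastq_dir = Path(fastq_dir).resolve()  # the format expects absolute paths
    rows = []
    for fwd in sorted(fastq_dir.glob("*_1.fastq.gz")):
        sample_id = fwd.name[: -len("_1.fastq.gz")]
        rev = fastq_dir / f"{sample_id}_2.fastq.gz"
        if rev.exists():  # forward reads without a mate are skipped
            rows.append([sample_id, str(fwd), str(rev)])
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(rows)
    return rows


# Demo on empty placeholder files with a made-up accession.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("SRR0000001_1.fastq.gz", "SRR0000001_2.fastq.gz"):
        (Path(tmp) / name).touch()
    manifest = Path(tmp) / "manifest.tsv"
    demo_rows = write_manifest(tmp, manifest)
    manifest_text = manifest.read_text()
print(manifest_text)
```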
shell:
    '''
    # Creates a QIIME2 summary artifact on demultiplexed FASTQ sequences
    qiime demux summarize --i-data {input.q2_import} --o-visualization {output.raw} &> {log}
    qiime demux summarize --i-data {input.q2_primerRM} --o-visualization {output.primer} &>> {log}
    '''
shell:
    '''
    qiime cutadapt trim-paired \
    --p-cores {threads} \
    --i-demultiplexed-sequences {input.q2_import} \
    --p-front-f {config[primerF]} \
    --p-front-r {config[primerR]} \
    --p-error-rate {config[primer_err]} \
    --p-overlap {config[primer_overlap]} \
    --o-trimmed-sequences {output.q2_primerRM} &> {log}
    '''
shell:
    '''
    conda activate qiime2-2019.7

    qiime feature-table filter-samples \
        --i-table {input.table} \
        --m-metadata-file {input.metadata} \
        --o-filtered-table {output.table_cdi_pos} &> {log}

    qiime feature-table summarize \
        --i-table {output.table_cdi_pos} \
        --o-visualization {output.table_cdi_pos_viz} \
        --m-sample-metadata-file {input.metadata} --verbose &>> {log}   

    qiime feature-table filter-seqs \
        --i-data {input.seq} \
        --i-table {output.table_cdi_pos} \
        --o-filtered-data {output.seq_cdi_pos} &>> {log} 

    qiime feature-table tabulate-seqs \
        --i-data {output.seq_cdi_pos} \
        --o-visualization {output.seq_cdi_pos_viz} --verbose &>> {log}
    conda deactivate
    '''    
shell:
  '''
    conda activate qiime2-2019.7
    qiime feature-classifier classify-sklearn \
      --i-classifier {input.classifier} \
      --i-reads {input.seq_cdi_pos} \
      --o-classification {output.taxonomy} &> {log}

    conda deactivate    
  '''
shell:
  '''
  qiime metadata tabulate \
    --m-input-file {input.taxonomy} \
    --o-visualization {output.taxonomy_viz} &> {log}

  qiime tools export \
    --input-path {input.taxonomy} \
    --output-path {params.outdir} &>> {log}

  qiime taxa barplot \
    --i-table {input.feature_table} \
    --i-taxonomy {input.taxonomy} \
    --m-metadata-file {input.metadata} \
    --o-visualization {output.taxa_barplot_viz} &>> {log}  
  '''

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/chaodi51/tag_seq
Name: tag_seq
Version: 1
Copyright: Public Domain
License: MIT License