HRIBO: High-throughput annotation by Ribo-seq workflow for analyzing bacterial Ribo-seq data


We present HRIBO (High-throughput annotation by Ribo-seq), a workflow to enable reproducible and high-throughput analysis of bacterial Ribo-seq data. The workflow performs all required pre-processing steps and quality control. Importantly, HRIBO outputs annotation-independent ORF predictions based on two complementary prokaryotic-focused tools, and integrates them with additional computed features. This facilitates both the rapid discovery of ORFs and their prioritization for functional characterization.

For a detailed description of this workflow, its installation, usage and examples, please refer to the ReadTheDocs documentation.

HRIBO installs all dependencies via conda. Once you have conda installed, simply type:

 conda create -c bioconda -c conda-forge -n snakemake snakemake
 source activate snakemake
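
To check that the environment works, you can ask Snakemake for its version; this is a generic sanity check rather than an HRIBO-specific step:

 snakemake --version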

Basic usage

Retrieving the input files and running the workflow, either locally or on a server cluster via a queuing system, works as follows. Create a project directory and change into it:

 mkdir project
 cd project

Retrieve HRIBO from GitHub:

 git clone git@github.com:gelhausr/HRIBO.git

The workflow requires a genome sequence (FASTA), an annotation file (GTF) and the sequencing result files (FASTQ). We recommend retrieving both the genome and the annotation file from Ensembl Genomes. Copy the genome and the annotation file into the project folder, decompress them, and name them genome.fa and annotation.gtf.
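
As a minimal sketch of this step (the <genome> and <annotation> placeholders stand for whatever files you downloaded; they are not files shipped with HRIBO):

 gunzip <genome>.fa.gz <annotation>.gtf.gz
 mv <genome>.fa genome.fa
 mv <annotation>.gtf annotation.gtf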

Create a folder fastq and copy your compressed fastq.gz files into the fastq folder.
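
For example (the source path is a placeholder for wherever your sequencing files are stored):

 mkdir fastq
 cp /path/to/your/reads/*.fastq.gz fastq/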

Please copy the templates of the sample sheet and the config file into the HRIBO folder:

 cp HRIBO/templates/config.yaml HRIBO/
 cp HRIBO/templates/samples.tsv HRIBO/

Customize config.yaml with the adapter sequence used and, optionally, the path to a precomputed STAR genome index. For correct removal of reads mapping to ribosomal genes, please specify the taxonomic group of the organism used (Eukarya, Bacteria, Archaea). Then edit the sample sheet to match your project, using one line per sequencing result and stating the method used (RIBO for ribosome profiling, RNA for RNA-seq), the applied condition (e.g. A, B, CTRL, TREAT), the replicate (e.g. 1, 2, ...) and the file name. The following is an example:

method  condition  replicate  fastqFile
RIBO    A          1          "fastq/FP-ctrl-1-2.fastq.gz"
RIBO    B          1          "fastq/FP-treat-1-2.fastq.gz"
RNA     A          1          "fastq/Total-ctrl-1-2.fastq.gz"
RNA     B          1          "fastq/Total-treat-1-2.fastq.gz"
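
If you prefer to create the sheet from the shell, the example above corresponds to tab-separated lines such as the following; this is only a sketch, and the exact column layout should be checked against HRIBO/templates/samples.tsv:

 printf 'method\tcondition\treplicate\tfastqFile\n' > HRIBO/samples.tsv
 printf 'RIBO\tA\t1\t"fastq/FP-ctrl-1-2.fastq.gz"\n' >> HRIBO/samples.tsv
 printf 'RIBO\tB\t1\t"fastq/FP-treat-1-2.fastq.gz"\n' >> HRIBO/samples.tsv
 printf 'RNA\tA\t1\t"fastq/Total-ctrl-1-2.fastq.gz"\n' >> HRIBO/samples.tsv
 printf 'RNA\tB\t1\t"fastq/Total-treat-1-2.fastq.gz"\n' >> HRIBO/samples.tsv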

Now you can start your workflow.

Run Snakemake locally:

 snakemake --use-conda -s HRIBO/Snakefile --configfile HRIBO/config.yaml --directory ${PWD} -j 20 --latency-wait 60 
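
Before launching the full run, a dry run (-n, standard Snakemake behaviour) only lists the jobs that would be executed and is a cheap way to validate the configuration:

 snakemake -n -s HRIBO/Snakefile --configfile HRIBO/config.yaml --directory ${PWD}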

Run Snakemake on the cluster:

Edit HRIBO/cluster.yaml according to your queuing system and cluster hardware. The following example works for Grid Engine:

 snakemake --use-conda -s HRIBO/Snakefile --configfile HRIBO/config.yaml --directory ${PWD} -j 20 --cluster-config HRIBO/cluster.yaml --cluster "qsub -N {cluster.jobname} -cwd -q {cluster.qname} -pe {cluster.parallelenvironment} -l {cluster.memory} -o {cluster.logoutputdir} -e {cluster.erroroutputdir} -j {cluster.joinlogs} -M <email>" --latency-wait 60 

Once the workflow has finished, you can create an automatically generated report.html file with the following command:

 snakemake --report report.html

Code Snippets

21
22
shell:
    "mkdir -p auxiliary; HRIBO/scripts/enrich_annotation.py -a {input.annotation} -o {output}"
32
33
34
35
36
shell:
    """
    mkdir -p auxiliary;
    awk -F'\\t' '/^[^#]/ {{printf "%s\\t%s\\t%s\\t%s\\t%s\\t%s\\t%s\\t%s\\tID=uid%s;\\n", $1, $2, $3, $4, $5, $6, $7, $8, NR-1}}' {input} > {output}
    """
46
47
shell:
    "mkdir -p auxiliary; HRIBO/scripts/samples_to_xlsx.py -i {input} -o {output}"
59
60
shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"
72
73
shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"
85
86
shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel_reparation.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"
97
98
shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_read_table.py -r {input.reads} -t {input.total} -o {output}"
109
110
shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_read_table.py -r {input.reads} -t {input.total} -o {output}"
125
126
127
128
129
130
131
132
133
134
shell:
    """
    mkdir -p auxiliary;
    if [ -z {params.contrasts} ]
    then
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    else
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -c {params.contrasts} -g {input.genome} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """
150
151
152
153
154
155
156
157
158
159
shell:
    """
    mkdir -p auxiliary;
    if [ -z {params.contrasts} ]
    then
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    else
        HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -c {params.contrasts}  -g {input.genome} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """
177
178
179
180
181
182
183
184
185
shell:
    """
    if [ -z {params.contrasts} ]
    then
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    else
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -c {params.contrasts} -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """
204
205
206
207
208
209
210
211
212
shell:
    """
    if [ -z {params.contrasts} ]
    then
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    else
        mkdir -p auxiliary; HRIBO/scripts/generate_excel_overview.py -c {params.contrasts} -a {input.annotation} -g {input.genome} --xtail {input.xtail} --deltate {input.deltate} --riborex {input.riborex} -t {input.totalreads} --mapped_reads_deepribo {input.deepribo} --mapped_reads_reparation {input.reparation} -o {output}
    fi
    """
10
11
12
13
14
shell:
    """
    mkdir -p tracks;
    HRIBO/scripts/concatenate_gff.py {input.reparation_orfs} {input.currentAnnotation} -o {output}
    """
18
19
run:
    shell("mkdir -p deepribo; mv {input} deepribo/DeepRibo_model_v1.pt")
32
33
shell:
    "mkdir -p coverage_deepribo; HRIBO/scripts/coverage_deepribo.py --alignment_file {input.bam} --output_file_prefix coverage_deepribo/{wildcards.condition}-{wildcards.replicate}"
45
46
47
48
49
50
shell:
    """
    mkdir -p coverage_deepribo
    bedtools genomecov -bg -ibam {input.bam} -strand + > {output.covfwd}
    bedtools genomecov -bg -ibam {input.bam} -strand - > {output.covrev}
    """
65
66
67
68
69
70
shell:
    """
    mkdir -p deepribo/{wildcards.condition}-{wildcards.replicate}/0/;
    mkdir -p deepribo/{wildcards.condition}-{wildcards.replicate}/1/;
    DataParser.py {input.covS} {input.covAS} {input.asiteS} {input.asiteAS} {input.genome} deepribo/{wildcards.condition}-{wildcards.replicate} -g {input.annotation}
    """
80
81
shell:
    "mkdir -p deepribo; Rscript HRIBO/scripts/parameter_estimation.R -f {input} -o {output}"
 96
 97
 98
 99
100
shell:
    """
    mkdir -p deepribo;
    DeepRibo.py predict deepribo/ --pred_data {wildcards.condition}-{wildcards.replicate}/ -r {params.rpkm} -c {params.cov} --model {input.model} --dest {output} --num_workers {threads}
    """
110
111
shell:
    "mkdir -p tracks; HRIBO/scripts/create_deepribo_gff.py -c {wildcards.condition} -r {wildcards.replicate} -i {input} -o {output}"
121
122
shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input} -o {output}"
132
133
shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input.merged_gff} -o {output}"
145
146
shell:
    "mkdir -p tracks; HRIBO/scripts/merge_duplicates_deepribo.py -i {input.ingff} -o {output.merged} -a {input.annotation}"
159
160
shell:
    "mkdir -p auxiliary; HRIBO/scripts/generate_excel_deepribo.py -t {input.total} -r {input.reads} -g {input.genome} -o {output}"
172
173
174
175
176
shell:
    """
    mkdir -p tracks;
    HRIBO/scripts/concatenate_gff.py {input.deepribo_orfs} {input.reparation_orfs} {input.currentAnnotation} -o {output}
    """
4
5
6
7
8
9
run:
    if not os.path.exists("contrasts"):
        os.makedirs("contrasts")
    for f in CONTRASTS:
        print(f)
        open(f"contrasts/{f}", 'a').close()
22
23
24
25
26
shell:
    """
    mkdir -p diffex_input/riborex/;
    python3 HRIBO/scripts/prepare_diffex_input.py -r {input.rawreads} -c {wildcards.contrast}  -t riborex -o diffex_input/riborex/
    """
40
41
42
43
44
shell:
    """
    mkdir -p diffex_input/xtail/;
    python3 HRIBO/scripts/prepare_diffex_input.py -r {input.rawreads} -c {wildcards.contrast} -t xtail -o diffex_input/xtail/
    """
31
32
33
34
35
shell:
    """
    mkdir -p deltate;
    HRIBO/scripts/prepare_deltate_input.py -c {params.contrast} -r {input.rawreads} -b bam/ -o {params.out_dir}
    """
55
56
57
58
59
60
61
62
63
64
shell:
    """
    mkdir -p deltate;
    touch {output.fcribo}
    touch {output.fcrna}
    touch {output.fcte}
    touch deltate/{params.contrast}/Result_figures.pdf
    DTEG.R {input.ribo} {input.rna} {input.samples} 0 deltate/{params.contrast}/ || true
    cp deltate/{params.contrast}/Result_figures.pdf {output.fig}
    """
81
82
83
84
shell:
    """
    python3 HRIBO/scripts/generate_excel_deltate.py -a {input.annotation} -g {input.genome} -i {input.deltate_ribo} -r {input.deltate_rna} -t {input.deltate_te} -o {output.xlsx_sorted} --padj_cutoff {params.padj_cutoff} --log2fc_cutoff {params.log2fc_cutoff}
    """
94
95
96
97
shell:
    """
    python3 HRIBO/scripts/merge_differential_expression.py {input.deltate} -o {output} -t deltate
    """
12
13
14
15
16
shell:
    """
    mkdir -p riborex;
    HRIBO/scripts/riborex.R -r {input.ribo} -m {input.rna} -c {input.cv} -x {output.table};
    """
31
32
33
34
shell:
    """
    python3 HRIBO/scripts/generate_excel_riborex.py -a {input.annotation} -g {input.genome} -i {input.riborex_out} -o {output.xlsx_sorted} --padj_cutoff {params.padj_cutoff} --log2fc_cutoff {params.log2fc_cutoff}
    """
44
45
46
47
shell:
    """
    python3 HRIBO/scripts/merge_differential_expression.py {input.riborex} -o {output} -t riborex
    """
14
15
16
17
18
shell:
    """
    mkdir -p xtail;
    HRIBO/scripts/xtail.R -r {input.ribo} -m {input.rna} -c {input.cv} -x {output.table} -f {output.fcplot} -p {output.rplot};
    """
33
34
35
36
shell:
    """
    python3 HRIBO/scripts/generate_excel_xtail.py -a {input.annotation} -g {input.genome} -i {input.xtail_out} -o {output.xlsx_sorted} --padj_cutoff {params.padj_cutoff} --log2fc_cutoff {params.log2fc_cutoff}
    """
46
47
48
49
shell:
    """
    python3 HRIBO/scripts/merge_differential_expression.py {input.xtail} -o {output} -t xtail
    """
11
12
shell:
    "mkdir -p genomeSegemehlIndex; echo \"Computing Segemehl index\"; segemehl.x --threads {threads} -x {output.index} -d {input.genome} 2> {log}"
40
41
42
43
shell:
    """
    mkdir -p sammulti; segemehl.x -e -d {input.genome} -i {input.genomeSegemehlIndex} {params.fastq} --threads {threads} -o {output.sammulti} 2> {log}
    """
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
shell:
    """
    set +e
    mkdir -p sam
    awk '$2 == "4"' {input.sammulti} > {input.sammulti}.unmapped
    gawk -i inplace '$2 != "4"' {input.sammulti}
    samtools view -H <(cat {input.sammulti}) | grep '@HD' > {output.sam}
    samtools view -H <(cat {input.sammulti}) | grep '@SQ' | sort -t$'\t' -k1,1 -k2,2V >> {output.sam}
    samtools view -H <(cat {input.sammulti}) | grep '@RG' >> {output.sam}
    samtools view -H <(cat {input.sammulti}) | grep '@PG' >> {output.sam}
    cat {input.sammulti} |grep -v '^@' | grep -w 'NH:i:1' >> {output.sam}
    exitcode=$?
    if [ $exitcode -eq 1 ]
    then
        exit 1
    else
        exit 0
    fi
    """
82
shell: "if [ \"{params.method}\" == \"NOTSET\" ]; then HRIBO/scripts/sam_strand_inverter.py --sam_in_filepath={input.sam} --sam_out_filepath={output.sam}; else cp {input.sam} {output.sam}; fi"
92
93
shell:
    "mkdir -p bammulti; samtools view -@ {threads} -bh {input.sam} | samtools sort -@ {threads} -o {output} -O bam"
103
104
shell:
    "mkdir -p rRNAbam; samtools view -@ {threads} -bh {input.sam} | samtools sort -@ {threads} -o {output} -O bam"
115
116
shell:
    "mkdir -p maplink; ln -s {params.inlink} {params.outlink}"
 9
10
shell:
    "mkdir -p tracks; cat {input.reparation} >> {output}.unsorted; bedtools sort -i {output}.unsorted > {output};"
20
21
shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input.mergedGff} -o {output}"
31
32
shell:
    "mkdir -p tracks; HRIBO/scripts/merge_duplicates_reparation.py -i {input} -o {output}"
43
44
shell:
    "mkdir -p tracks; HRIBO/scripts/reannotate_orfs.py -a {input.annotation} -c {input.reparation} -o {output}"
54
55
shell:
    "mkdir -p tracks; HRIBO/scripts/annotation_unite.py -a {input} -o {output}"
17
18
19
20
21
shell:
    """
    mkdir -p metageneprofiling;
    HRIBO/scripts/read_length_statistics.py -a {input.bamfiles} -r {params.readlengths} -o metageneprofiling/ > {log}
    """
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
shell:
    """
    mkdir -p metageneprofiling;
    if [ {params.colorList} == nocolor ]; then
        colorList="";
    else
        colorList="--color_list {params.colorList}";
    fi;
    HRIBO/scripts/metagene_profiling.py -b {input.bam} -g {input.genome} -a {input.annotation} -o {output.meta} \
        --read_lengths {params.readlengths} \
        --normalization_methods {params.normalizationMethods} \
        --mapping_methods {params.mappingMethods} \
        --positions_in_ORF {params.positionsInORF} \
        --positions_out_ORF {params.positionsOutORF} \
        --filtering_method {params.filteringMethods} \
        --neighboring_genes_distance {params.neighboringGenesDistance} \
        --rpkm_threshold {params.rpkmThreshold} \
        --length_cutoff {params.lengthCutoff} \
        --output_formats {params.outputFormats} \
        --include_plotly_js {params.includePlotlyJS} \
        ${{colorList}} > {log}
    """
11
12
13
14
15
16
shell:
    """
    mkdir -p pca;
    sed -e '1s/-/_/g' {input.rawreads} > {output.rawreads};
    HRIBO/scripts/preparePCAinput.py -s {input.samples} -o {output.meta};
    """
SnakeMake From line 11 of rules/pca.smk
32
33
34
35
36
shell:
    """
    mkdir -p pca;
    HRIBO/scripts/analyse_variance.R -r {input.rawreads} -m {input.meta} -o pca/;
    """
SnakeMake From line 32 of rules/pca.smk
49
50
51
52
53
shell:
    """
    mkdir -p pca;
    HRIBO/scripts/plot_PCA.py -r {input.rld} -p {input.pvar} -c {input.cor} -o pca/;
    """
SnakeMake From line 49 of rules/pca.smk
7
8
shell:
    "mkdir -p genomes; cp {input.genome} genomes/genome.fa"
16
17
shell:
    "mkdir -p annotation; cp {input.annotation} annotation/annotation.gff"
25
26
shell:
    "mkdir -p annotation; HRIBO/scripts/gtf2gff3.py -a {input} -o {output}"
13
14
shell:
    "mkdir -p qc/4unique; fastqc -o qc/4unique -t {threads} -f sam_mapped {input.sam}; mv qc/4unique/{params.prefix}_fastqc.html {output.html}; mv qc/4unique/{params.prefix}_fastqc.zip {output.zip}"
28
29
shell:
    "mkdir -p qc/3mapped; fastqc -o qc/3mapped -t {threads} -f sam_mapped {input.sam}; mv qc/3mapped/{params.prefix}_fastqc.html {output.html}; mv qc/3mapped/{params.prefix}_fastqc.zip {output.zip}"
43
44
shell:
    "mkdir -p qc/5removedrRNA; fastqc -o qc/5removedrRNA -t {threads} {input}; mv qc/5removedrRNA/{params.prefix}_fastqc.html {output.html}; mv qc/5removedrRNA/{params.prefix}_fastqc.zip {output.zip}"
55
56
57
58
59
60
61
62
63
64
65
shell:
    """
    mkdir -p qc/all;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "gene" ]];
    then
        featureCounts -T {threads} -t gene -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """
76
77
78
79
80
81
82
83
84
85
86
shell:
    """
    mkdir -p qc/trnainall;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "tRNA" ]];
    then
        featureCounts -T {threads} -t tRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """
 97
 98
 99
100
101
102
103
104
105
106
107
shell:
    """
    mkdir -p qc/rrnainall;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "rRNA" ]];
    then
        featureCounts -T {threads} -t rRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """
118
119
120
121
122
123
124
125
126
127
128
shell:
    """
    mkdir -p qc/rrnainallaligned;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "rRNA" ]];
    then
        featureCounts -T {threads} -t rRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """
139
140
141
142
143
144
145
146
147
148
149
shell:
    """
    mkdir -p qc/rrnainuniquelyaligned;
    column3=$(cut -f3 auxiliary/unambigous_annotation.gff | sort | uniq)
    if [[ " ${{column3[@]}} " =~ "rRNA" ]];
    then
        featureCounts -T {threads} -t rRNA -g ID -a {input.annotation} -o {output.txt} {input.bam};
    else
        touch {output.txt};
    fi
    """
159
160
shell:
    "mkdir -p coverage; bedtools genomecov -ibam {input} -bg > {output}"
13
14
shell:
    "mkdir -p qc/1raw; fastqc -o qc/1raw -t {threads} {input.fastq}; mv qc/1raw/{params.prefix}_fastqc.html {output.html}; mv qc/1raw/{params.prefix}_fastqc.zip {output.zip}"
27
28
shell:
    "mkdir -p qc/2trimmed; fastqc -o qc/2trimmed -t {threads} {input}; mv qc/2trimmed/{params.prefix}_fastqc.html {output.html}; mv qc/2trimmed/{params.prefix}_fastqc.zip {output.zip}"
46
47
48
49
50
51
shell:
    """
    mkdir -p qc/1raw
    fastqc -o qc/1raw -t {threads} {input.fastq1}; mv qc/1raw/{params.prefix1}_fastqc.html {output.html1}; mv qc/1raw/{params.prefix1}_fastqc.zip {output.zip1}
    fastqc -o qc/1raw -t {threads} {input.fastq2}; mv qc/1raw/{params.prefix2}_fastqc.html {output.html2}; mv qc/1raw/{params.prefix2}_fastqc.zip {output.zip2}
    """
68
69
70
71
72
73
shell:
    """
    mkdir -p qc/2trimmed;
    fastqc -o qc/2trimmed -t {threads} {input}; mv qc/2trimmed/{params.prefix1}_fastqc.html {output.html1}; mv qc/2trimmed/{params.prefix1}_fastqc.zip {output.zip1}
    fastqc -o qc/2trimmed -t {threads} {input}; mv qc/2trimmed/{params.prefix2}_fastqc.html {output.html2}; mv qc/2trimmed/{params.prefix2}_fastqc.zip {output.zip2}
    """
114
115
shell:
    "export LC_ALL=en_US.utf8; export LANG=en_US.utf8; multiqc -f -d --exclude picard --exclude gatk -z -o {params.dir} qc/1raw qc/2trimmed qc/3mapped qc/4unique qc/5removedrRNA qc/all qc/trnainall qc/rrnainallaligned qc/rrnainuniquelyaligned qc/rrnainall trimmed  2> {log}"
13
14
15
16
17
18
19
20
21
22
shell:
    """
    if [ "{params.features}" == None ]; then
        features="";
    else
        features="--use_features {params.features}";
    fi;
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O --for_diff_expr -o {output} -t {threads} -a {input.annotation} ${{features}}
    """
34
35
36
37
38
shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O -o {output} -t {threads} -a {input.annotation}
    """
50
51
52
53
54
shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O -o {output} -t {threads} -a {input.annotation}
    """
66
67
68
69
70
shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O -o {output} -t {threads} -a {input.annotation}
    """
82
83
84
85
86
shell:
    """
    mkdir -p readcounts
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O --with_M --fraction -o {output} -t {threads} -a {input.annotation}
    """
 98
 99
100
101
102
shell:
    """
    mkdir -p auxiliary
    HRIBO/scripts/call_featurecounts.py -b {input.bam} -s 1 --with_O --fraction -o {output} -t {threads} -a {input.annotation}
    """
113
114
115
116
shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """
127
128
129
130
shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """
141
142
143
144
shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """
155
156
157
158
shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """
169
170
171
172
shell:
    """
    mkdir -p readcounts; HRIBO/scripts/map_reads_to_annotation.py -i {input.reads} -a {input.annotation} -o {output}
    """
184
185
shell:
    "mkdir -p readcounts; HRIBO/scripts/total_mapped_reads.py -b {input.bam} -m {output.mapped} -l {output.length}"
197
198
shell:
    "mkdir -p readcounts; HRIBO/scripts/total_mapped_reads.py -b {input.bam} -m {output.mapped} -l {output.length}"
210
211
shell:
    "mkdir -p readcounts; HRIBO/scripts/total_mapped_reads.py -b {input.bam} -m {output.mapped} -l {output.length}"
10
11
12
run:
    outputName = os.path.basename(input[0])
    shell("mkdir -p uniprotDB; mv {input} uniprotDB/{outputName}; gunzip uniprotDB/{outputName}")
34
35
shell:
    "mkdir -p reparation; if [ uniprotDB/uniprot_sprot.fasta.bak does not exist ]; then cp -p uniprotDB/uniprot_sprot.fasta uniprotDB/uniprot_sprot.fasta.bak; fi; mkdir -p {params.prefix}/tmp; reparation.pl -bam {input.bam} -g {input.genome} -gtf {input.gtf} -db {input.db} -out {params.prefix} -threads {threads}; if [ uniprotDB/uniprot_sprot.fasta does not exist ]; then cp -p uniprotDB/uniprot_sprot.fasta.bak uniprotDB/uniprot_sprot.fasta; fi;"
45
46
shell:
    "mkdir -p tracks; HRIBO/scripts/create_reparation_gff.py -c {wildcards.condition} -r {wildcards.replicate} -i {input} -o {output}"
56
57
shell:
    "mkdir -p tracks; HRIBO/scripts/concatenate_gff.py {input} -o {output}"
 9
10
11
12
shell:
    """
    mkdir -p annotation; awk -F'\\t' '$3 == "rRNA" || $3 == "tRNA"' {input.annotation} | awk -F'\\t' '{{print $1 FS $4 FS $5 FS "." FS "." FS $7}}' > {output.annotation}
    """
23
24
shell:
    "mkdir -p norRNA; mkdir -p mapuniqnorrna; bedtools intersect -v -a {input.mapuniq} -b {input.annotation} > {output.bam}"
18
19
shell:
    "mkdir -p trimlink; ln -s {params.inlink} {params.outlink};"
35
36
shell:
    "mkdir -p trimlink; ln -s {params.inlink1} {params.outlink1}; ln -s {params.inlink2} {params.outlink2};"
54
55
shell:
    "mkdir -p trimmed; cutadapt -j {threads} {params.adapter3} {params.adapter5} {params.quality} {params.filtering} -o {output.fastq} {input.fastq}"
74
75
shell:
    "mkdir -p trimmed; cutadapt -j {threads} {params.adapter3q} {params.adapter5q} {params.adapter3p} {params.adapter5p} {params.quality} {params.filtering} -o {output.fastq1} -p {output.fastq2} {input.fastq1} {input.fastq2}"
11
12
shell:
    "samtools faidx {rules.retrieveGenome.output}"
23
24
shell:
    "mkdir -p genomes; cut -f1,2 {input[0]} > genomes/sizes.genome"
34
35
shell:
    "mkdir -p genomes; HRIBO/scripts/reverse_complement.py --input_fasta_filepath genomes/genome.fa --output_fasta_filepath genomes/genome.rev.fa"
46
47
shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string ATG --output_gff3_filepath {output}"
58
59
shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string GTG,TTG,CTG --output_gff3_filepath {output}"
71
72
shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string TAG,TGA,TAA --output_gff3_filepath {output}"
83
84
shell:
    "mkdir -p tracks; HRIBO/scripts/motif_to_gff.py --input_genome_fasta_filepath {input.fwd} --input_reverse_genome_fasta_filepath {input.rev} --motif_string AAGG --output_gff3_filepath {output}"
98
99
shell:
    "samtools index -@ {threads} maplink/{params.prefix}"
112
113
shell:
    "samtools index -@ {threads} bammulti/{params.prefix}"
126
127
shell:
    "samtools index -@ {threads} rRNAbam/{params.prefix}"
148
149
shell:
    "mkdir -p totalmappedtracks; mkdir -p totalmappedtracks/raw; mkdir -p totalmappedtracks/mil; mkdir -p totalmappedtracks/min; HRIBO/scripts/mapping.py --mapping_style global --bam_path {input.bam} --wiggle_file_path totalmappedtracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"
170
171
shell:
    "mkdir -p uniquemappedtracks; mkdir -p uniquemappedtracks/raw; mkdir -p uniquemappedtracks/mil; mkdir -p uniquemappedtracks/min; HRIBO/scripts/mapping.py --mapping_style global --bam_path {input.bam} --wiggle_file_path uniquemappedtracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"
182
183
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
194
195
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
206
207
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
218
219
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
230
231
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
242
243
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
254
255
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
266
267
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
278
279
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
290
291
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
302
303
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
314
315
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
336
337
shell:
    "mkdir -p globaltracks; mkdir -p globaltracks/raw; mkdir -p globaltracks/mil; mkdir -p globaltracks/min; HRIBO/scripts/mapping.py --mapping_style global --bam_path {input.bam} --wiggle_file_path globaltracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"
348
349
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
360
361
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
372
373
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
384
385
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
396
397
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
408
409
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
430
431
shell:
    "mkdir -p centeredtracks; mkdir -p centeredtracks/raw; mkdir -p centeredtracks/mil; mkdir -p centeredtracks/min; HRIBO/scripts/mapping.py --mapping_style centered --bam_path {input.bam} --wiggle_file_path centeredtracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"
441
442
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
453
454
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
465
466
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
477
478
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
489
490
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
501
502
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
523
524
shell:
    "mkdir -p fiveprimetracks; mkdir -p fiveprimetracks/raw; mkdir -p fiveprimetracks/mil; mkdir -p fiveprimetracks/min; HRIBO/scripts/mapping.py --mapping_style first_base_only --bam_path {input.bam} --wiggle_file_path fiveprimetracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"
535
536
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
547
548
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
559
560
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
571
572
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
583
584
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
595
596
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
617
618
shell:
    "mkdir -p threeprimetracks; mkdir -p threeprimetracks/raw; mkdir -p threeprimetracks/mil; mkdir -p threeprimetracks/min; HRIBO/scripts/mapping.py --mapping_style last_base_only --bam_path {input.bam} --wiggle_file_path threeprimetracks/ --no_of_aligned_reads_file_path {input.stats} --library_name {params.prefix};"
629
630
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
641
642
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
653
654
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
665
666
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
677
678
shell:
    "wigToBigWig {input.fwd} {input.genomeSize} {output.fwd}"
689
690
shell:
    "wigToBigWig {input.rev} {input.genomeSize} {output.rev}"
702
703
shell:
    "mkdir -p tracks; multiBamSummary bins --smartLabels --bamfiles {input.bam} -o {output} -p {threads};"
713
714
shell:
    "mkdir -p figures; plotCorrelation -in {input.npz} --corMethod spearman --skipZeros --plotTitle \"Spearman Correlation of Read Counts\" --whatToPlot heatmap --colorMap RdYlBu --plotNumbers -o {output.correlation} --outFileCorMatrix SpearmanCorr_readCounts.tab"
724
725
shell:
    "mkdir -p tracks; cat {input[0]} | grep -v '\tgene\t' > tracks/annotation-woGenes.gtf; gtf2bed < tracks/annotation-woGenes.gtf > tracks/annotation.bed"
736
737
shell:
    "mkdir -p tracks; cut -f1-6 {input[0]} > tracks/annotationNScore.bed6;  awk '{{$5=1 ; print ;}}' tracks/annotation.bed6 > tracks/annotation.bed6; bedToBigBed -type=bed6 -tab tracks/annotation.bed6 {input[1]} tracks/annotation.bb"
752
753
754
755
756
757
758
759
760
761
762
shell:
    """
    set +e
    mkdir -p tracks/color
    bigWigToWig {input.infwd} {params.unzippedfwd}
    bigWigToWig {input.inrev} {params.unzippedrev}
    sed -i '2s/^/track type=wiggle_0 visibility=full color=0,0,128 autoscale=on\\n/' {params.unzippedfwd}
    sed -i '2s/^/track type=wiggle_0 visibility=full color=0,130,200 autoscale=on\\n/' {params.unzippedrev}
    gzip -f {params.unzippedfwd}
    gzip -f {params.unzippedrev}
    """
774
775
776
777
778
779
780
781
782
783
784
shell:
    """
    set +e
    mkdir -p tracks/color
    cp {input.rbs} ./tracks/color/
    cp {input.start} ./tracks/color/
    cp {input.stop} ./tracks/color/
    sed -i '1s/^/##track type=wiggle_0 visibility=full color=145,30,180 autoscale=on\\n/' {output.outrbs}
    sed -i '1s/^/##track type=wiggle_0 visibility=full color=210,245,60 autoscale=on\\n/' {output.outstart}
    sed -i '1s/^/##track type=wiggle_0 visibility=full color=230,25,75 autoscale=on\\n/' {output.outstop}
    """

URL: https://github.com/RickGelhausen/HRIBO
Name: hribo
Version: 1.7.1
Copyright: Public Domain
License: GNU General Public License v3.0