A Snakemake pipeline to go from raw PacBio Iso-Seq .subreads.bam files to assembled mRNA isoforms (FASTA format)

Version: v0.1.0

Snakemake workflow: PacBio Iso-Seq processing pipeline

Introduction

A Snakemake workflow for processing raw PacBio subreads.bam files into polished mRNA isoforms in FASTA format.
Optionally, long assembled mRNAs can be al…
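Read in order, the code snippets further down this page suggest the tool chain sketched below. This is only a reader's summary: the one-line descriptions are assumptions about each tool's role, not the workflow author's wording, and the hidden snippets on the original page are not represented.

```python
# Order taken from the visible snippets; descriptions are a reader's summary
# (assumptions), and the list may be incomplete.
STEPS = [
    ("ccs", "circular consensus reads from raw subreads"),
    ("lima", "primer removal and demultiplexing"),
    ("isoseq3 refine", "poly(A) filtering and concatemer removal"),
    ("isoseq3 cluster", "clustering reads into transcripts"),
    ("samtools fasta", "BAM to FASTA conversion"),
    ("pbmm2 index", "reference index"),
    ("pbmm2 align", "alignment of transcripts to the reference"),
    ("isoseq3 collapse", "collapsing alignments into unique isoforms"),
    ("gffread", "GFF to GTF conversion"),
]


def describe(steps):
    """Return one numbered line per step."""
    return [f"{i}. {tool}: {what}" for i, (tool, what) in enumerate(steps, start=1)]


print("\n".join(describe(STEPS)))
```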

Code Snippets
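Each shell directive in this section is written as a run of adjacent string literals. Python joins such literals into one string before any placeholder is filled in, which is why every fragment ends with a space. The sketch below illustrates this with plain `str.format` as a stand-in for Snakemake's own placeholder substitution; the placeholder names and all values are hypothetical.

```python
# Adjacent string literals are concatenated by Python itself,
# so each fragment needs a trailing space to keep the flags apart.
command = (
    "ccs "
    "--min-rq {min_accuracy} "        # comments between fragments are allowed
    "--report-file {report_file} "
    "--num-threads {threads} "
    "{subreads} {ccs_bam}"
)

# Hypothetical values, used only to show the rendered command line.
rendered = command.format(
    min_accuracy=0.9,
    report_file="ccs_report.txt",
    threads=4,
    subreads="sample.subreads.bam",
    ccs_bam="sample.ccs.bam",
)
print(rendered)
```

A fragment without its trailing space would fuse with the next one (for example `--num-threads 4sample.subreads.bam`), so the spacing is worth checking whenever a flag is added.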

# Generate circular consensus sequences (CCS) from the raw subreads.
shell:
    "ccs "
    "--min-rq {params.min_accuracy} "       # minimum predicted accuracy
    "--report-file {params.report_file} "
    "--num-threads {threads} "
    "{input.subreads} {output.css}"

# Remove primers and demultiplex the CCS reads with lima.
shell:
    "lima "
    "--isoseq "
    "--peek-guess "                     # remove spurious false positives
    "--num-threads {threads} "
    "--log-file {params.report_file} "  # write the lima log to the report file
    "{input} "
    "{params.barcodes} "
    "{params.temp_filename}"

# Keep reads with a poly(A) tail and remove concatemers.
shell:
    "isoseq3 refine "
    "--require-polya "
    "--num-threads {threads} "
    "--log-file {params.report_file} "
    "{input} "
    "{params.barcodes} "
    "{output}"

# Cluster the refined reads into transcripts.
shell:
    "isoseq3 cluster "
    "--num-threads {threads} "
    "--log-file {params.report_file} "
    "--verbose "
    "{input} "
    "{output}"

# Convert the BAM records to FASTA.
shell:
    "samtools fasta -o {output} {input}"

# Build a pbmm2 index of the reference with the Iso-Seq preset.
shell:
    "pbmm2 index "
    "--num-threads {threads} "
    "--preset ISOSEQ "
    "{input} "
    "{output}"

# Align the transcripts to the indexed genome and sort the alignments.
shell:
    "pbmm2 align "
    "--preset ISOSEQ "
    "--sort "
    "--num-threads {threads} "
    "{input.genome_index} "
    "{input.transcripts} "
    "{output}"

# Collapse redundant aligned transcripts into unique isoforms (GFF),
# then move the temporary FASTA to its final location.
shell:
    "isoseq3 collapse "
    "--min-aln-coverage {params.min_aln_coverage} "
    "--min-aln-identity {params.min_aln_identity} "
    "--max-fuzzy-junction {params.max_fuzzy_junction} "
    "{input.aln} "
    "{input.css} "
    "{output.gff}; "
    "mv {params.temp_fasta} {output.fasta}"

# Convert the collapsed GFF annotation to GTF.
shell:
    "gffread {input} -T -o {output}"
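The isoseq3 collapse snippet chains two commands with `;`. In a plain shell the second command would run even if the first failed, but Snakemake, as far as its documentation describes, runs shell commands in bash strict mode (`set -euo pipefail`) by default, so the `mv` should be skipped after a failure. The sketch below only illustrates that shell behaviour and assumes `bash` is on the PATH; `false` stands in for a failing `isoseq3 collapse` and `echo moved` for the `mv`.

```python
import subprocess


def run(script):
    """Run a script with bash and return (exit code, stdout)."""
    done = subprocess.run(["bash", "-c", script], capture_output=True, text=True)
    return done.returncode, done.stdout.strip()


# Plain ';' chaining: the second command runs despite the failure.
plain = run("false; echo moved")

# Strict mode: the script stops at the first failing command.
strict = run("set -euo pipefail; false; echo moved")

# '&&' chaining: the second command runs only after a success.
chained = run("false && echo moved")

print(plain, strict, chained)
```

Writing the chain with `&&` makes the dependency explicit and keeps the behaviour the same even if strict mode is switched off.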

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/SilkeAllmannLab/pacbio_snakemake
Name: pacbio_snakemake
Version: v0.1.0
Copyright: Public Domain
License: MIT License