Snakemake module containing different analyses provided by parabricks.
Snakemake module containing an array of steps provided by the Parabricks toolkit
:speech_balloon: Introduction
The module contains rules to align `.fastq` files and call variants in the resulting `.bam` files using Clara Parabricks.
To use this module, a server with access to one or more compatible NVIDIA GPUs is required. Input data should be trimmed `.fastq` files, and we recommend generating these with hydra-genetics/prealignment for a smooth transition. To make use of read group information, add machine, flowcell and library specifics to `units.tsv`.
:heavy_exclamation_mark: Dependencies
In order to use this module, the following dependencies are required:
:school_satchel: Preparations
Sample and unit data
Input data should be added to `samples.tsv` and `units.tsv`.
The following information needs to be added to these files:
Column Id | Description |
---|---|
`samples.tsv` | |
sample | unique sample/patient id, one per row |
tumor_content | ratio of tumor cells to total cells |
`units.tsv` | |
sample | same sample/patient id as in `samples.tsv` |
type | data type identifier (one letter), can be one of **T**umor, **N**ormal, **R**NA |
platform | type of sequencing platform, e.g. `NovaSeq` |
machine | specific machine id, e.g. NovaSeq instruments have `@Axxxxx` |
flowcell | identifier of flowcell used |
lane | flowcell lane number |
barcode | sequence library barcode/index, connect forward and reverse indices by `+`, e.g. `ATGC+ATGC` |
fastq1/2 | absolute path to forward and reverse reads |
adapter | adapter sequences to be trimmed, separated by comma |
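
As a rough illustration of how the machine, flowcell, lane and barcode columns could feed read group information, the sketch below parses one `units.tsv` row and builds a SAM-style `@RG` line. The row values, the helper name `read_group` and the exact composition of the `ID` and `PU` fields are assumptions made for this example; the module itself may assemble read groups differently.

```python
import csv
import io

# Hypothetical units.tsv content. Column names follow the table above,
# all values are made up for illustration.
UNITS_TSV = (
    "sample\ttype\tplatform\tmachine\tflowcell\tlane\tbarcode\t"
    "fastq1\tfastq2\tadapter\n"
    "NA12878\tN\tNovaSeq\t@A00001\tHXXXXXXXX\tL001\tATGC+ATGC\t"
    "/data/NA12878_R1.fastq.gz\t/data/NA12878_R2.fastq.gz\t"
    "AGATCGGAAGAGC,AGATCGGAAGAGC\n"
)


def read_group(row):
    """Build a SAM-style @RG line from one units.tsv row (assumed layout)."""
    sample_type = f"{row['sample']}_{row['type']}"
    fields = [
        ("ID", f"{sample_type}.{row['flowcell']}.{row['lane']}"),
        ("SM", sample_type),
        ("PU", f"{row['flowcell']}.{row['lane']}.{row['barcode']}"),
        ("PM", row["platform"]),  # PM is the free-text platform model tag
    ]
    return "@RG\t" + "\t".join(f"{tag}:{value}" for tag, value in fields)


rows = list(csv.DictReader(io.StringIO(UNITS_TSV), delimiter="\t"))
rg_line = read_group(rows[0])
print(rg_line)
```

Leaving any of these columns empty would produce a less specific read group, which is why the table asks for them per unit.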
Reference data
Reference files should be specified in `config.yaml` in the section `reference`. A `.fasta` file is needed as well as a `.vcf` file containing known indels used during the alignment process. For the RNA alignment part, `genome_dir` should specify a directory containing reference files generated by STAR.
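
A small pre-flight check can catch missing reference entries before any GPU job is started. The sketch below represents the relevant part of `config.yaml` as the dictionary Snakemake would load it into. Only the section name `reference` and the key `genome_dir` come from the text above; the key names `fasta` and `sites`, the placement of `genome_dir` inside `reference`, the helper name and all paths are assumptions.

```python
# Assumed shape of the loaded config.yaml; key names other than
# "reference" and "genome_dir" are guesses, and all paths are made up.
config = {
    "reference": {
        "fasta": "/ref/genome.fasta",
        "sites": "/ref/known_indels.vcf",
        "genome_dir": "/ref/star_index",
    }
}


def missing_reference_keys(cfg, required=("fasta", "sites", "genome_dir")):
    """Return the required reference keys that are absent or empty."""
    reference = cfg.get("reference", {})
    return [key for key in required if not reference.get(key)]


missing = missing_reference_keys(config)
if missing:
    raise SystemExit(f"config.yaml: missing reference entries: {missing}")
```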
:rocket: Usage
To use this module in your workflow, follow the description in the Snakemake docs. Add the module to your `Snakefile` like so:

    module parabricks:
        snakefile:
            github(
                "hydra-genetics/parabricks",
                path="workflow/Snakefile",
                tag="1.0.0",
            )
        config:
            config


    use rule * from parabricks as parabricks_*
Compatibility
Latest:
- prealignment:v1.1.0
See COMPATIBLITY.md file for a complete list of module compatibility.
Output files
The following output files should be targeted via another rule:
File | Description |
---|---|
`parabricks/pbrun_deepvariant/{sample}.vcf` | variant call file generated by deepvariant |
`parabricks/pbrun_fq2bam/{sample}_{type}.bam` | alignment file generated by BWA-mem |
`parabricks/pbrun_mutectcaller_t/{sample}_T.vcf` | variant call file generated by Mutect2 using tumor-only mode |
`parabricks/pbrun_mutectcaller_tn/{sample}.vcf` | variant call file generated by Mutect2 using tumor/normal mode |
`parabricks/pbrun_rna_fq2bam/{sample}_R.bam` | alignment file generated by STAR |
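
One way to target these files is a helper that expands the path patterns for the sample and type combinations found in `units.tsv`. The sketch below is plain Python so it can be tried outside Snakemake. The function name `parabricks_targets`, the example units and the policy of choosing tumor/normal mode whenever both T and N exist for a sample are assumptions. The deepvariant output is left out because the text does not say which alignment it is derived from.

```python
def parabricks_targets(units):
    """List output paths from the table above for (sample, type) pairs."""
    types_by_sample = {}
    for sample, unit_type in units:
        types_by_sample.setdefault(sample, set()).add(unit_type)

    targets = []
    for sample, types in sorted(types_by_sample.items()):
        # DNA alignments for tumor and normal units
        for dna_type in sorted(types & {"T", "N"}):
            targets.append(f"parabricks/pbrun_fq2bam/{sample}_{dna_type}.bam")
        # RNA alignment
        if "R" in types:
            targets.append(f"parabricks/pbrun_rna_fq2bam/{sample}_R.bam")
        # Assumed policy: paired calling when a normal exists, else tumor-only
        if {"T", "N"} <= types:
            targets.append(f"parabricks/pbrun_mutectcaller_tn/{sample}.vcf")
        elif "T" in types:
            targets.append(f"parabricks/pbrun_mutectcaller_t/{sample}_T.vcf")
    return targets


# Hypothetical (sample, type) pairs as they might appear in units.tsv
example_units = [("S1", "T"), ("S1", "N"), ("S2", "T"), ("S2", "R")]
example_targets = parabricks_targets(example_units)
for path in example_targets:
    print(path)
```

In a Snakefile, the returned list could be passed as the `input` of a top-level target rule such as `rule all`.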
:judge: Rule Graph
Code Snippets
    shell:
        "{params.cuda} pbrun deepvariant "
        "--ref {input.fasta} "
        "--in-bam {input.bam} "
        "--num-gpus {params.num_gpus} "
        "--out-variants {output.vcf} "
        "{params.extra} "
        "--tmp-dir parabricks/pbrun_deepvariant/{wildcards.sample} &> {log}"

    shell:
        "{params.cuda} pbrun fq2bam "
        "--ref {input.fasta} "
        "--in-fq {params.in_fq} "
        "--knownSites {input.sites} "
        "--num-gpus {params.num_gpus} "
        "--out-bam {output.bam} "
        "--out-duplicate-metrics {output.metrics} "
        "--out-recal-file {output.recal} "
        "{params.extra} "
        "--tmp-dir parabricks/pbrun_fq2bam/{wildcards.sample}_{wildcards.type} &> {log}"

    shell:
        "{params.cuda} pbrun mutectcaller "
        "--ref {input.fasta} "
        "--in-tumor-bam {input.bam_t} "
        "--tumor-name {wildcards.sample}_T "
        "--in-tumor-recal-file {input.recal_t} "
        "--num-gpus {params.num_gpus} "
        "--out-vcf {output.vcf} "
        "{params.extra} "
        "--tmp-dir parabricks/pbrun_mutectcaller_t/{wildcards.sample} &> {log}"

    shell:
        "{params.cuda} pbrun mutectcaller "
        "--ref {input.fasta} "
        "--in-tumor-bam {input.bam_t} "
        "--tumor-name {wildcards.sample}_T "
        "--in-tumor-recal-file {input.recal_t} "
        "--in-normal-bam {input.bam_n} "
        "--normal-name {wildcards.sample}_N "
        "--in-normal-recal-file {input.recal_n} "
        "--num-gpus {params.num_gpus} "
        "--out-vcf {output.vcf} "
        "{params.extra} "
        "--tmp-dir parabricks/pbrun_mutectcaller_tn/{wildcards.sample} &> {log}"

    shell:
        "{params.cuda} pbrun rna_fq2bam "
        "{params.extra} "
        "--genome-lib-dir {input.genome_dir} "
        "--in-fq {params.in_fq} "
        "--num-gpus {params.num_gpus} "
        "--output-dir parabricks/pbrun_rna_fq2bam/ "
        "--out-bam {output.bam} "
        "--out-prefix {wildcards.sample}_{wildcards.type} "
        "--tmp-dir parabricks/pbrun_rna_fq2bam/{wildcards.sample}_{wildcards.type} "
        "{params.extra} "
        "--logfile {log}"
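
To see roughly what command line such a directive expands to, the deepvariant template can be filled in with plain Python string formatting. This is only an approximation of what Snakemake does when it resolves a `shell` directive. Every value below is a hypothetical stand-in, including the assumption that `params.cuda` holds an environment variable prefix.

```python
from types import SimpleNamespace

# Adjacent string literals are joined by Python into a single template,
# mirroring the deepvariant shell directive.
template = (
    "{params.cuda} pbrun deepvariant "
    "--ref {input.fasta} "
    "--in-bam {input.bam} "
    "--num-gpus {params.num_gpus} "
    "--out-variants {output.vcf} "
    "{params.extra} "
    "--tmp-dir parabricks/pbrun_deepvariant/{wildcards.sample} &> {log}"
)

# Hypothetical stand-ins for the objects Snakemake would supply.
command = template.format(
    params=SimpleNamespace(cuda="CUDA_VISIBLE_DEVICES=0", num_gpus=1, extra=""),
    input=SimpleNamespace(
        fasta="ref.fasta",
        bam="parabricks/pbrun_fq2bam/S1_N.bam",
    ),
    output=SimpleNamespace(vcf="parabricks/pbrun_deepvariant/S1.vcf"),
    wildcards=SimpleNamespace(sample="S1"),
    log="parabricks/pbrun_deepvariant/S1.vcf.log",
)
print(command)
```

Printing the expanded command this way can help when checking that values passed through `extra` end up where expected.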