# A Snakemake workflow for variant calling using GATK4 best practices
## Motivation

This repository contains a pipeline built with Snakemake for variant calling from Illumina-generated sequences. It is based on the GATK best practices for variant calling and supports either hard or soft filtering of variants.

Additionally, this pipeline aims to reproduce a recently published pipeline that optimized the GATK4 variant-calling workflow for Plasmodium falciparum (preprint). However, it is not limited to P. falciparum and can be used for any organism of interest.
## Pipeline sections

The pipeline handles paired-end reads. Below are the analysis sections in the Snakefile:

### Step 1 - Compile List of Output Files

- `rule all`: gather all output files
### Step 2 - Gather Genome Data

- `gather_genome_data`: aggregate genome data from the snpeff folder
- `gatk_genome_dict`: create a genome dictionary for GATK tools
- `samtools_index`: index the genome FASTA file
- `bedops_gff2bed`: convert the genome annotation from .gff to .bed
### Step 3 - Perform FastQ Quality Control

- `trim_reads`: trim adapters and low-quality bases using trimmomatic or fastp
### Step 4 - Map Reads to Genome

- `bwa_index`: generate bwa genome-index files for mapping reads
- `bwa_mem`: map reads to the genome, run fixmate, convert .sam to .bam and remove artifacts
- `mark_duplicates`: mark duplicate reads using gatk MarkDuplicatesSpark or Samblaster
### Step 5 - Generate Mapping Quality Statistics

- `samtools_idxstats`: calculate alignment statistics per reference sequence
- `samtools_flagstats`: calculate and summarize various alignment statistics
- `samtools_depth`: calculate the depth of coverage at each position in the genome (see the sketch after this list)
- `gatk_insert_size_metrics`: collect insert-size metrics
- `gatk_alignment_summary_metrics`: generate a summary of alignment metrics from the BAM file
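For example, the per-position output of `samtools_depth` can be reduced to a mean coverage with a one-liner (a minimal sketch; the input file name is illustrative):

```bash
# samtools depth writes three tab-separated columns: chromosome, position, depth;
# averaging the third column gives the mean coverage over reported positions.
awk '{ sum += $3; n++ } END { if (n) printf("mean depth: %.2f over %d positions\n", sum / n, n) }' sample.depth.tsv
```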
### Step 6 - Perform Variant Calling

- `gatk_haplotypecaller`: call SNPs and indels via local re-assembly of haplotypes and generate gVCFs
- `generate_sample_name_map`: generate a map of sample names and the respective VCF files
- `gatk_genomics_db_import`: merge gVCFs into one genomic database
- `gatk_genotype_gvcfs`: perform joint genotyping and generate the final VCF in which all samples have been jointly genotyped
### Step 7 - Perform Variant Filtering (Hard or Soft)

- `bcftools_normalize`: normalize indels, left-align variants, split multiallelic sites into multiple rows and recover multiallelics from multiple rows
- `hard_filter_variants`:
  - `gatk_split_variants`: separate SNPs and indels into separate VCF files
  - `gatk_filter_hard`: apply hard filters to SNPs and indels (see the generic sketch after this list)
  - `gatk_merge_vcfs`: merge SNPs and indels back into one VCF file
- `soft_filter_variants`:
  - `gatk_vqsr_indels`: perform variant quality score recalibration (VQSR) on indels
  - `gatk_apply_vqsr_indels`: apply variant quality score recalibration to indels
  - `gatk_vqsr_snps`: perform variant quality score recalibration on SNPs
  - `gatk_apply_vqsr_snps`: apply variant quality score recalibration to SNPs
- `gatk_filter_pass`: filter out variants that do not pass the hard or soft filters
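The exact filter expressions are supplied via `params` in the Snakefile (see `gatk_filter_hard` in the code snippets below). As a generic, illustrative sketch using the standard GATK best-practices SNP thresholds rather than this repository's actual values:

```bash
# Tag SNPs that fail common hard-filter thresholds; failing records get the
# filter name written to the FILTER column instead of PASS.
gatk VariantFiltration \
    -R genome.fasta \
    -V snps.vcf.gz \
    --filter-name "QD2"  --filter-expression "QD < 2.0" \
    --filter-name "FS60" --filter-expression "FS > 60.0" \
    --filter-name "MQ40" --filter-expression "MQ < 40.0" \
    -O snps.hard_filtered.vcf.gz
```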
### Step 8 - Annotate Variants and Calculate Allele Frequencies

- `snpeff_annotate_variants`: variant annotation and functional-effect prediction
- `gatk_variants_to_table`: extract variant information into a table
- `python`: calculate allele frequencies and transform the summary table from wide to long format (a worked example follows this list)
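As a worked example of the allele-frequency arithmetic (mirroring `workflow/scripts/vcf_allele_frequency.R`, shown in the code snippets below, which divides the ALT depth by the total depth in GATK's `AD` field):

```bash
# Alternate-allele frequency = ALT depth / (REF depth + ALT depth).
# For a sample with AD = "30,10": 10 / (30 + 10) = 0.250.
echo "30,10" | awk -F',' '{ printf("%.3f\n", $2 / ($1 + $2)) }'
```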
## Project dependencies

- Conda - an open-source package and environment management system that runs on various platforms, including Windows, macOS and Linux
- Snakemake - a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean, modern, Python-style specification language
## Where to start

- Install conda for your operating system (the pipeline is currently tested on Linux and macOS).

- Clone this project by running the following command in your terminal:

  ```bash
  git clone https://github.com/kevin-wamae/variant-calling-with-Snakemake-and-GATK.git
  ```

- Navigate into the cloned directory, which will be the root directory of the project:

  ```bash
  cd variant-calling-with-Snakemake-and-GATK
  ```

Note: all subsequent commands should be run from the root directory of this project. However, users can modify the scripts to their liking.
## Directory structure

Below is the default directory structure:

- config/ - contains the Snakemake configuration files
- input/ - contains input files
  - bed/ - contains the BED files specifying the intervals of interest
  - fastq/ - contains the FastQ files
  - known_sites/ - contains the positive-training dataset for variant filtering
- output/ - contains the numbered output directories from the analysis
- workflow/ - contains the Snakemake workflow files
  - envs/ - contains the Conda environment-configuration files
  - scripts/ - contains the scripts used in the pipeline

```
.
|-- config
|-- input
|   |-- bed
|   |-- fastq
|   `-- known_sites
|-- output
`-- workflow
    |-- envs
    `-- scripts
```
This pipeline uses `glob_wildcards()` to match FastQ sample names and mate files in the `input/fastq/` directory, using the naming convention below:

- `reads_R1.fastq.gz` - first mate
- `reads_R2.fastq.gz` - second mate

If you have a different naming convention (e.g. this), you can rename the FastQ files by executing the Python script in the `workflow/scripts/` directory:

```bash
python workflow/scripts/fastq_rename.py
```

The user can therefore deposit their FastQ files in the `input/fastq/` directory, or edit the `config/config.yaml` file to point to the directory containing the FastQ files, and the pipeline will automatically match the sample names and mate files.
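For example, a correctly named set of paired-end files would appear in `input/fastq/` as follows (sample names are illustrative):

```bash
ls input/fastq/
# sample01_R1.fastq.gz  sample01_R2.fastq.gz
# sample02_R1.fastq.gz  sample02_R2.fastq.gz
```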
The configuration file (`config/config.yaml`) specifies additional resources and can be modified to suit one's needs, such as:

- input files
- output directories
- the option to choose between tools and methods, e.g.:
  - `fastp` or `trimmomatic` for read trimming
  - `gatk MarkDuplicatesSpark` or `samblaster` for marking duplicates
  - hard or soft filtering of variants
- other parameters, such as the number of threads to use
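For instance, the Snakefile checks for the values `Fastp` vs `Trimmomatic` for trimming and `MarkDuplicatesSpark` vs `Samblaster` for duplicate marking (visible in the code snippets below). A quick way to inspect the relevant settings (the search pattern is illustrative; the key names in `config/config.yaml` may differ):

```bash
# Show the config entries that control threading, tool choice and filtering
grep -E -i 'threads|trim|duplicate|filter' config/config.yaml
```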
## Running the analysis

After navigating into the root directory of the project, run the analysis by executing the following commands in your terminal:

- Create a conda analysis environment by running the command below. This creates a conda environment named `variant-calling-gatk` and installs Snakemake, SnpEff and Graphviz (for visualizing the workflow) in the environment. Note: this only needs to be done once.

  ```bash
  conda env create --file workflow/envs/environment.yaml
  ```

- Activate the conda environment. Note: this needs to be done every time you exit and restart your terminal and want to re-run this pipeline.

  ```bash
  conda activate variant-calling-gatk
  ```

- Execute the shell script below to create the SnpEff database for variant annotation. It downloads the P. falciparum genome data from PlasmoDB and creates a database in the output/ directory. Note: this is an important step because the genome FASTA and GFF files are required for read mapping and variant calling. The script can also be modified to suit one's needs, such as downloading the genome files for another organism of interest.

  ```bash
  bash workflow/scripts/create_snpeff_db.sh
  ```
- Finally, execute the whole Snakemake pipeline:

  ```bash
  snakemake --use-conda --cores 2 --jobs 1
  ```

  This runs the whole pipeline using a maximum of two cores and one job in parallel. The `--cores` flag specifies the number of cores to use for each job and the `--jobs` flag specifies the number of jobs to run in parallel.

  To run the pipeline with more resources, increase the number of cores and jobs. For example, to use 4 cores and 2 jobs in parallel:

  ```bash
  snakemake --use-conda --cores 4 --jobs 2
  ```

  Additionally, you can change the `threads` entry in line 3 of the configuration file (`config/config.yaml`) to specify the number of cores to use for each step in the pipeline.
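- Before a long run, you can preview what Snakemake will do. Both commands below are standard Snakemake options (the DAG export relies on Graphviz, which `environment.yaml` installs):

  ```bash
  # Dry run: print the jobs that would be executed, without running anything
  snakemake --use-conda --cores 2 --jobs 1 --dry-run

  # Render the workflow as a directed acyclic graph
  snakemake --dag | dot -Tsvg > dag.svg
  ```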
- Once the analysis is complete, look through the output/ directory to view the results of the analysis.
- Summary statistics can be generated with standalone scripts in the `workflow/scripts/` directory. To do this, create a conda environment:

  ```bash
  conda env create --file workflow/envs/variant-calling-stats.yaml
  ```

  and activate it:

  ```bash
  conda activate variant-calling-stats
  ```
  - To generate a summary of the raw reads, run the following command and look through the `stats_1_raw_fastq.tsv` file in the project directory:

    ```bash
    python workflow/scripts/get_raw_fastq_stats.py
    ```

  - To generate a summary of the trimmed reads, run the following command and look through the `stats_2_trimmed_fastq.tsv` file:

    ```bash
    python workflow/scripts/get_trimmed_fastq_stats.py
    ```

  - To generate a summary of the mapped reads, run the following command and look through the `stats_3_mapped_reads.tsv` file:

    ```bash
    python workflow/scripts/get_mapped_reads_stats.py
    ```

  - To generate a summary of the variants called, run the following command and look through the `stats_4_variant_calling.tsv` file:

    ```bash
    bash workflow/scripts/get_variant_calling_stats.sh
    ```
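  - If you want all four summaries in one go, a simple wrapper (a sketch; it assumes the stats environment is active and you are in the project root) could be:

    ```bash
    # Run the three Python summaries, then the variant-calling summary
    for script in get_raw_fastq_stats get_trimmed_fastq_stats get_mapped_reads_stats; do
        python "workflow/scripts/${script}.py"
    done
    bash workflow/scripts/get_variant_calling_stats.sh
    ```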
  - When done, exit this conda environment:

    ```bash
    conda deactivate
    ```

- Finally, deactivate the variant-calling conda environment:

  ```bash
  conda deactivate
  ```
## Feedback and Issues

Report any issues or bugs by opening an issue here, or contact me via email at wamaekevin[at]gmail.com.
## Code Snippets

Code snippets from the workflow's helper scripts and Snakemake rules:
```python
import os
import sys

# Get the directory path and output file from the command line arguments
vcf_dir = sys.argv[1]
tsv_file = sys.argv[2]

# Get a list of all the VCF files in the directory
vcf_files = [f for f in os.listdir(vcf_dir) if f.endswith('.vcf.gz')]

# Create a list of tuples with the file paths and names
file_list = []
for vcf_file in vcf_files:
    file_name = os.path.splitext(vcf_file)[0]
    file_path = os.path.join(vcf_dir, vcf_file)
    file_list.append((file_path, file_name))

# Sort the list by the second element of each tuple (the file name)
file_list.sort(key=lambda x: x[1])

# Write the file names and paths to a TSV file
with open(tsv_file, 'w') as tsv_out:
    for file_path, file_name in file_list:
        tsv_out.write(os.path.splitext(file_name)[0] + '\t' + file_path + '\n')
```
```bash
output_file=$1

awk -F '\t' 'BEGIN { OFS = FS }
NR == 1 {
    # Print the header with new column names for ANN fields
    for (i = 1; i <= 5; ++i) {
        printf("%s%s", $i, OFS)
    }
    printf("Allele%sAnnotation%sAnnotation_Impact%sGene_Name%sGene_ID%sFeature_Type%sFeature_ID%sTranscript_BioType%sRank%sHGVS.c%sHGVS.p%scDNA.pos%sCDS.pos%sAA.pos%sDistance%sErrors",
           OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS, OFS)
    for (i = 7; i <= NF; ++i) {
        printf("%s%s", OFS, $i)
    }
    printf("\n")
    next
}
{
    # Print the data with ANN fields split and without the ANN field
    for (i = 1; i <= 5; ++i) {
        printf("%s%s", $i, OFS)
    }
    split($6, ann, "|")
    for (i = 1; i <= 16; ++i) {
        printf("%s%s", ann[i], OFS)
    }
    for (i = 7; i <= NF; ++i) {
        printf("%s%s", $i, (i == NF) ? "\n" : OFS)
    }
}' > /dev/stdout
```
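To see what the column splitting does, you can feed the script a single header line (a toy example; real input comes from VariantsToTable, and the sample column name is illustrative):

```bash
# The script reads VariantsToTable output on stdin; on the header line it
# replaces the 6th (ANN) column with 16 named annotation columns.
printf 'CHROM\tPOS\tREF\tALT\tTYPE\tANN\tsample1.GT\n' |
    bash workflow/scripts/split_annot_column.sh
```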
```r
library(argparse)    # For command line argument parsing
library(data.table)  # For fast data manipulation
library(tidyverse)   # For data wrangling and visualization

# Define command line arguments
parser <- argparse::ArgumentParser()
parser$add_argument("--input", type = "character", help = "Input file path")
parser$add_argument("--output", type = "character", help = "Output file path")
args <- parser$parse_args(commandArgs(TRUE))

# Read in the data using data.table::fread for fast reading of large files
vcf <- data.table::fread(args$input, nThread = 4)

# Reshape the data
vcf <- vcf %>%
  mutate(across(23:ncol(.), as.character)) %>% # convert columns to be reshaped to character type
  mutate(row_id = row_number()) %>%            # add a unique row identifier
  pivot_longer(
    cols = -c(CHROM:Errors, row_id),
    names_to = c("sample_name", "variable"),
    names_sep = "\\."
  ) %>%
  pivot_wider(names_from = "variable", values_from = "value") %>%
  select(-row_id) %>% # drop the row_id column
  mutate(
    # alternate-allele frequency = ALT depth / (REF depth + ALT depth)
    alt_allele_freq = if_else(
      str_detect(AD, ",0$"),
      0,
      as.numeric(str_extract(AD, "\\d+$")) /
        (as.numeric(str_extract(AD, "^[^,]+")) + as.numeric(str_extract(AD, "\\d+$")))
    ),
    alt_allele_freq = round(alt_allele_freq, 3)
  ) %>%
  filter(alt_allele_freq != 0) %>%
  arrange(sample_name, POS)

# Write the reshaped table to a tab-separated file
write_tsv(vcf, args$output)
```
```
run:
    shell(
        # cp - copy genome fasta file from snpeff database location
        """
        cp -f {input.genome} {output.genome}
        """
    )
    shell(
        # cp - copy annotation file from snpeff database location
        """
        cp -f {input.gff} {output.gff}
        """
    )
```
```
shell:
    """
    gatk --java-options "{params.java_opts}" CreateSequenceDictionary \
        --REFERENCE {input.genome} \
        --OUTPUT {output.genome_dict} \
        2> {log}
    """
```
```
shell:
    """
    samtools faidx {input.genome}
    """
```
```
shell:
    """
    convert2bed \
        --input=gff \
        --output=bed < {input} |\
    grep -e {params.feature} > {output}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------- Running {params.trimmer} -------"
    echo "##############################################"

    if [[ "{params.trimmer}" == "Fastp" ]]; then
        fastp \
            --thread {threads} \
            {params.opts_fastp} \
            --in1 {input.r1} \
            --in2 {input.r2} \
            --out1 {output.r1} \
            --out2 {output.r2} \
            --unpaired1 {output.r1_unpaired} \
            --unpaired2 {output.r2_unpaired} \
            --json {log.json} \
            --html {log.html} \
            2> {log.log}
    else
        trimmomatic PE \
            -threads {threads} \
            {input.r1} {input.r2} \
            {output.r1} {output.r1_unpaired} \
            {output.r2} {output.r2_unpaired} \
            {params.opts_trimmomatic} \
            2> {log.log}
    fi
    """
```
```
shell:
    """
    echo "##############################################"
    echo "----------- Running BWA Index ----------"
    echo "##############################################"

    bwa index -p {output.index} {input.genome} 2> {log}
    """
```
```
shell:
    """
    echo "#####################################################"
    echo "Running BWA-> Fixmate-> SamFormatConverter-> CleanSam"
    echo "#####################################################"

    bwa mem \
        -t {threads} \
        {params.extra_bwa} \
        {params.read_groups} \
        {input.idx} \
        {input.reads} \
        2> {log.bwa} |\
    samtools fixmate \
        --threads {threads} \
        {params.extra_fixmate} \
        --output-fmt sam \
        /dev/stdin \
        /dev/stdout \
        2> {log.fixmate} |\
    gatk --java-options "{params.java_opts}" SamFormatConverter \
        --INPUT /dev/stdin \
        --OUTPUT /dev/stdout \
        2> {log.sam2bam} |\
    gatk --java-options "{params.java_opts}" CleanSam \
        -R {input.genome} \
        -I /dev/stdin \
        -O {output.bam} \
        2> {log.cleansam}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "--- Running {params.mark_duplicate_tool} ---"
    echo "##############################################"

    if [ "{params.mark_duplicate_tool}" == "MarkDuplicatesSpark" ]; then
        gatk --java-options "{params.java_opts}" MarkDuplicatesSpark \
            --spark-master local[{threads}] \
            -I {input.bam} \
            -O {output.bam} \
            {params.extra_gatk} \
            2> {log}
    elif [ "{params.mark_duplicate_tool}" == "Samblaster" ]; then
        samtools view \
            -h {input.bam} |\
        samblaster \
            {params.extra_samblaster} \
            2> {log} |\
        samtools sort \
            -@ {threads} \
            -o {output.bam} \
            2>> {log}

        samtools index {output.bam}
    else
        echo "Unsupported mark duplicate tool selected: {params.mark_duplicate_tool}"
        exit 1
    fi
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------ Running Samtools IdxStats ------"
    echo "##############################################"

    samtools idxstats {input.bam} > {output.idxstats}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------ Running Samtools Flagstat ------"
    echo "##############################################"

    samtools flagstat {input.bam} > {output.flagstat}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------ Running Samtools Depth ------"
    echo "##############################################"

    samtools depth {input.bam} > {output.depth}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "-- Running GATK CollectInsertSizeMetrics --"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" CollectInsertSizeMetrics \
        {params.extra} \
        -R {input.genome} \
        -I {input.bam} \
        -O {output.metrics} \
        -H {output.histogram} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo " Running GATK CollectAlignmentSummaryMetrics "
    echo "##############################################"

    # note: unlike CollectInsertSizeMetrics, this tool takes no -H/histogram option
    gatk --java-options "{params.java_opts}" CollectAlignmentSummaryMetrics \
        {params.extra} \
        -R {input.genome} \
        -I {input.bam} \
        -O {output.metrics} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "----- Running GATK HaplotypeCaller -----"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" HaplotypeCaller \
        --native-pair-hmm-threads {threads} \
        {params.extra} \
        -R {input.genome} \
        -L {input.intervals} \
        -I {input.bam} \
        -O {output.vcf} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------ Generating VCF-Sample Map ------"
    echo "##############################################"

    python workflow/scripts/sample_vcf_map.py \
        {params.directory} \
        {output}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "---- Running GATK GenomicsDBImport ----"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" GenomicsDBImport \
        {params.extra} \
        --reader-threads {threads} \
        --genomicsdb-workspace-path {output.dir} \
        --sample-name-map {input.sample_map} \
        -L {input.intervals} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "----- Running GATK GenotypeGVCFs -----"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" GenotypeGVCFs \
        {params.extra} \
        -R {input.genome} \
        -V gendb://{input.db} \
        -L {input.intervals} \
        -O {output.vcf} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "--- Running BCFtools Normalize & Index VCF ---"
    echo "##############################################"

    bcftools norm \
        {params.normalise} \
        --fasta-ref {input.genome} \
        {input.vcf} \
        2> {log} |\
    bcftools annotate \
        {params.annotate} \
        2>> {log} |\
    bcftools view \
        {params.view} \
        --output-type z \
        --output-file {output.vcf} \
        2>> {log}

    tabix -p vcf {output.vcf}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "----- Running GATK SplitVariants -----"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" SelectVariants \
        -R {input.genome} \
        -V {input.vcf} \
        -O {output.snps} \
        --select-type-to-include SNP \
        2> {log.snps}

    gatk --java-options "{params.java_opts}" SelectVariants \
        -R {input.genome} \
        -V {input.vcf} \
        -O {output.indels} \
        --select-type-to-include INDEL \
        2> {log.indels}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "--- Running GATK VariantFiltration (Hard) ---"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" VariantFiltration \
        {params.extra_snps} \
        -R {input.genome} \
        -V {input.snps} \
        -O {output.snps} \
        2> {log.snps}

    gatk --java-options "{params.java_opts}" VariantFiltration \
        {params.extra_indels} \
        -R {input.genome} \
        -V {input.indels} \
        -O {output.indels} \
        2> {log.indels}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "--------- Running GATK MergeVcfs ---------"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" MergeVcfs \
        -I {input.snps} \
        -I {input.indels} \
        -O {output.vcf} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "- Running GATK VariantRecalibrator (INDELS) -"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" VariantRecalibrator \
        {params.extra} \
        -R {input.genome} \
        -V {input.vcf} \
        -O {output.recal} \
        -resource:{params.resource} {params.known_sites} \
        --tranches-file {output.tranches} \
        --rscript-file {output.plots} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "---- Running GATK ApplyVQSR (INDELS) -----"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" ApplyVQSR \
        {params.extra} \
        -R {input.genome} \
        -V {input.vcf} \
        -O {output.vcf} \
        --tranches-file {input.tranches} \
        --recal-file {input.recal} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "-- Running GATK VariantRecalibrator (SNPs) --"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" VariantRecalibrator \
        {params.extra} \
        -R {input.genome} \
        -V {input.vcf} \
        -O {output.recal} \
        -resource:{params.resource} {params.known_sites} \
        --tranches-file {output.tranches} \
        --rscript-file {output.plots} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------ Running GATK ApplyVQSR (SNPs) -------"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" ApplyVQSR \
        {params.extra} \
        -R {input.genome} \
        -V {input.vcf} \
        -O {output.vcf} \
        --tranches-file {input.tranches} \
        --recal-file {input.recal} \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------- Running GATK FilterVcfPass -------"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" SelectVariants \
        -V {input.vcf} \
        -O {output.vcf} \
        --exclude-filtered \
        2> {log}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "-------- Running SnpEff Annotate --------"
    echo "##############################################"

    snpEff ann \
        {params.extra} \
        -config {params.config} \
        {params.database} \
        {input.vcf} | bgzip -c > {output.vcf}

    tabix -p vcf {output.vcf}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "------ Running GATK VariantsToTable ------"
    echo "##############################################"

    gatk --java-options "{params.java_opts}" VariantsToTable \
        -V {input.vcf} \
        --fields CHROM \
        --fields POS \
        --fields REF \
        --fields ALT \
        --fields TYPE \
        --fields ANN \
        --genotype-fields GT \
        --genotype-fields AD \
        --genotype-fields DP \
        --genotype-fields GQ \
        --genotype-fields PGT \
        --genotype-fields PID \
        --genotype-fields PL \
        --genotype-fields PS \
        -O /dev/stdout \
        2> {log} |\
    bash workflow/scripts/split_annot_column.sh > {output.variants}
    """
```
```
shell:
    """
    echo "##############################################"
    echo "-------- Running R AlleleFrequency --------"
    echo "##############################################"

    Rscript workflow/scripts/vcf_allele_frequency.R \
        --input {input.variants} \
        --output {output.variants} \
        2> {log}
    """
```
## Support

- Future updates