Help improve this workflow!
This workflow has been published but could be further improved with some additional metadata: keyword(s) in the categories input, output, operation, and topic.
You can help improve this workflow by suggesting the addition or removal of keywords, suggesting changes, reporting issues, or requesting to become a maintainer of the workflow.
This pipeline implements admixture mapping on the output of the AncInf pipeline (i.e., RFMix output) and/or association mapping on the output of the pre-imp…
Code Snippets
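The snippets below read thresholds and paths from a config dictionary loaded from workflow/config.yml. For orientation, here is a hypothetical excerpt of that structure, shown as the Python dict Snakemake exposes as `config`; the keys are the ones actually referenced in the snippets, while every value is an illustrative placeholder, not taken from the repository:

```python
# Hypothetical excerpt of workflow/config.yml as Snakemake exposes it.
# Keys are those referenced in the snippets below; all values are placeholders.
config = {
    "samples": "accessory/keep_samples.txt",   # optional list of samples to keep
    "sites": "accessory/exclude_sites.txt",    # optional list of sites to exclude
    "match_controls": "true",
    "control_number": 4,
    "kinship_threshold": 0.0884,
    "covariate_file": "",
    "gwas": {"maf": 0.01, "mbc_pval": 1e-5, "missingness": 0.05, "pc_num": 4,
             "sig_threshold": 5e-8, "other_predictors": "sex"},
    "sHWE": {"maf": 0.01, "test_threshold": 100000, "pop_nums": [2, 3, 4, 5]},
    "IBD": {"filter_relateds": "true"},
    "admixMapping": {"gen_since_admixture": 8, "sig_threshold": 1e-4,
                     "ancestry_predictors": "AFR", "annotation_buffer": 100000},
    "annotation": {"download_refs": "true", "gff": "<url-or-path>.gff",
                   "genome": "<url-or-path>.genome", "rsIDs": "<path>",
                   "typed_key": "<path>"},
    "conditional_analysis": {"snp_list": "<path>", "distance": 1000000},
    "fine_mapping": {"distance": 500000},
}
```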
```r
suppressPackageStartupMessages(library(optparse))
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(magrittr))
suppressPackageStartupMessages(library(tidyverse))

######################### Read in arguments #########################
option_list <- list(
  make_option(c("-r", "--relateds"), help="Output of PLINK IBS calculation (i.e. XXX.genome), filtered for related individuals"),
  make_option(c("-o", "--output"), help="output_file"))
opt <- parse_args(OptionParser(option_list=option_list))

######################### Main Body #########################
rel = read.table(opt$relateds, header = T, comment.char = "")  # Read in IBD results from PLINK
rel %<>% mutate(ID1=paste(FID1,IID1,sep="]"), ID2=paste(FID2,IID2,sep="]"))  # Concatenate IDs for ease of comparison
multi_rel = rel %>% select(ID1,ID2) %>% pivot_longer(c(ID1,ID2)) %>% count(value) %>% filter(n>1)  # Identify individuals with more than one related individual
sing_rel = rel %>% filter(!(ID1 %in% multi_rel$value) & !(ID2 %in% multi_rel$value))  # Identify individuals with just a single related individual
sing_exclude = sing_rel %>% mutate(choice = rbinom(n(),1,0.5)) %>%
  mutate(Choice=case_when(choice==1~as.character(ID1), TRUE~as.character(ID2))) %>%
  select(Choice)  # Randomly select one of the two individuals from each single-relative pair
multi_exclude = multi_rel %>% mutate(Choice=value) %>% select(Choice)  # Remove all individuals that are related to multiple others
exclude = rbind(sing_exclude,multi_exclude)
exclude %<>% separate(Choice, c("FID","IID"), "]") %>% select(FID,IID)
write.table(exclude, opt$output, quote = F, row.names = F, col.names = F)
```
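For intuition, a minimal Python sketch of the same pair-resolution logic (column names follow PLINK's .genome report; the input path and the coin flip are illustrative stand-ins for the script's optparse arguments and its rbinom call):

```python
import random
import pandas as pd

rel = pd.read_table("related.genome", sep=r"\s+")  # hypothetical PLINK IBD output
rel["ID1"] = rel["FID1"].astype(str) + "]" + rel["IID1"].astype(str)
rel["ID2"] = rel["FID2"].astype(str) + "]" + rel["IID2"].astype(str)

counts = pd.concat([rel["ID1"], rel["ID2"]]).value_counts()
multi = set(counts[counts > 1].index)  # related to more than one sample: exclude all

exclude = set(multi)
for _, pair in rel.iterrows():
    if pair["ID1"] not in multi and pair["ID2"] not in multi:
        exclude.add(random.choice([pair["ID1"], pair["ID2"]]))  # drop one per pair

pd.DataFrame([s.split("]") for s in sorted(exclude)], columns=["FID", "IID"]) \
    .to_csv("excluded.txt", sep=" ", header=False, index=False)
```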
```python
shell: "rm -r input; rm -r plink/*; rm -r data; rm *txt; rm accessory/*"
```
```python
run:
    cmd = f"{CMD_PREFIX} plink --bfile {QUERY} --keep-allele-order --make-bed --maf {config['gwas']['maf']} --allow-no-sex --out plink/{BASE}_sub {plnk_rsrc(rule, plink_run_specs)}"
    if os.path.exists(config['samples']): cmd += f" --keep {config['samples']}"
    if os.path.exists(config['sites']): cmd += f" --exclude {config['sites']}"
    cmd += f"; sed -i 's/#//g' plink/{BASE}_sub.fam"  # Remove the '#' character from any sample names, which would trip up R in downstream steps
    shell(cmd)
    shell(f"{CMD_PREFIX} awk {{params.awk_string}}; {CMD_PREFIX} plink --bfile plink/{BASE}_sub --keep accessory/control_samples.txt --make-bed --out plink/{BASE}_sub_ctrls {plnk_rsrc(rule, plink_run_specs)}")
```
```python
run:
    import pandas as pd
    shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub --test-missing --allow-no-sex --out plink/{BASE}_sub")  # Use PLINK to calculate stats
    shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub --missing --allow-no-sex --out plink/{BASE}_sub")
    mbc = pd.read_table(f"plink/{BASE}_sub.missing", sep=r"\s+")  # Read output of PLINK
    mis = pd.read_table(f"plink/{BASE}_sub.lmiss", sep=r"\s+")
    dat = pd.concat([mbc[mbc['P'] < float(config['gwas']['mbc_pval'])]['SNP'],
                     mis[mis['F_MISS'] > float(config['gwas']['missingness'])]['SNP']]).drop_duplicates()  # Extract SNPs to exclude from the PLINK results
    dat.to_csv('accessory/missing_filter.txt', header=False, index=False)  # Write SNPs to be filtered
    shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub --exclude accessory/missing_filter.txt --keep-allele-order --make-bed --out plink/{BASE}_sub_mflt")  # Filter sites from the full dataset
    shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub_ctrls --exclude accessory/missing_filter.txt --keep-allele-order --make-bed --out plink/{BASE}_sub_ctrls_mflt")  # Filter sites from the controls
```
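As a toy walk-through of the two-way filter above (column names match PLINK's .missing and .lmiss reports; the data and thresholds are invented):

```python
import pandas as pd

# Invented miniature versions of PLINK's case/control differential-missingness
# (.missing) and per-SNP missingness (.lmiss) reports
mbc = pd.DataFrame({"SNP": ["rs1", "rs2", "rs3"], "P": [1e-7, 0.20, 0.90]})
mis = pd.DataFrame({"SNP": ["rs1", "rs2", "rs3"], "F_MISS": [0.01, 0.08, 0.02]})

mbc_pval, missingness = 1e-5, 0.05  # stand-ins for the config thresholds
drop = pd.concat([mbc[mbc["P"] < mbc_pval]["SNP"],
                  mis[mis["F_MISS"] > missingness]["SNP"]]).drop_duplicates()
print(drop.tolist())  # ['rs1', 'rs2'] -- these would go to accessory/missing_filter.txt
```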
```python
run:
    shell(f"{CMD_PREFIX} cat plink/{BASE}.genome.* | gzip > {BASE}.genome.gz; mv {BASE}.genome.gz plink")
```
```python
run:
    shell(f"zcat {{input[1]}} | awk {{params.awk_string}}; "  # Extract related individuals
          f"{CMD_PREFIX} Rscript scripts/filter_relateds.R -r accessory/related_IBS_ctrls.txt -o accessory/excluded_related_ctrls.txt; "  # This script cycles through related pairs and keeps one
          f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDprune_ctrls --remove accessory/excluded_related_ctrls.txt --keep-allele-order --make-bed --out plink/{BASE}_LDp_ctrls_IBDflt {plnk_rsrc(rule, plink_run_specs)}")
```
```python
run:
    shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDp_ctrls_IBDflt --keep plink/DS/{BASE}_LDprune_ctrls_DS{{wildcards.round}}.txt --maf {config['sHWE']['maf']} "
          f"--geno 0.0 --keep-allele-order --make-bed --out plink/DS/{BASE}_LDprune_ctrls_DS{{wildcards.round}} {plnk_rsrc(rule, plink_run_specs)}")
```
```python
run:
    with open('accessory/sHWE_downsampled_snps.txt', 'r') as tested:  # Count up the SNPs that were combined in the previous rule
        num_tested = 0
        for num_tested, line in enumerate(tested, start=1):  # start=1 so num_tested equals the line count (and is defined for an empty file)
            pass
    if num_tested < int(config['sHWE']['test_threshold']):
        # If not enough SNPs will be tested, print an error message and DO NOT create the output file.
        # The pipeline documentation encourages the user to adjust the downsampling parameters
        # (more rounds, fewer individuals, lower threshold, etc.).
        print(f"Failed Checkpoint\nReason: Fewer than {config['sHWE']['test_threshold']} unique SNPs downsampled for sHWE (number downsampled = {num_tested})\n"
              f"Increase the value set for the 'downsample_rounds' entry in the 'sHWE' section of the config file or reduce the sHWE 'test_threshold'")
        # Removing the output of the prior rule prompts re-creation of accessory/sHWE_downsampled_snps.txt,
        # which re-triggers downsampling when the user changes its parameters in the config file.
        shell("cp accessory/sHWE_downsampled_snps.txt accessory/.sHWE_downsampled_snps.txt; rm accessory/sHWE_downsampled_snps.txt")
    else:
        shell('touch accessory/.sHWE_downsample_pass.txt')  # If pass, create the hidden file required by the next rule
```
```python
run:
    # Assumes pandas (pd), scipy.stats, and os are imported at the top of the Snakefile
    max_entropy = -9
    best_popnum = -9
    with open("sHWE/entropy.txt", 'w') as ent_file:
        for file in input:  # Loop over input files and calculate the entropy of each
            popnum = file.split("_")[-1]  # Extract the population number from the filename
            if not os.stat(file).st_size == 0:  # Make sure the file is not empty
                dat = pd.read_table(file, header=None)  # Read in the vector of p-values from sHWE
                binned = pd.cut(dat.iloc[:, 0], params.bins).value_counts(sort=False)  # Bin and count vector values
                entropy = scipy.stats.entropy(binned.iloc[1:])  # Calculate entropy of bin counts, excluding the first bin, which corresponds to true positives (i.e. sites truly out of HWE)
                ent_file.write(f"{popnum}\t{entropy}\n")
                if entropy > max_entropy:  # Store the current value if better than the prior best result
                    best_popnum = popnum
                    max_entropy = entropy
            else:
                print(f"sHWE calculation failed for {popnum}; debug or remove this value from sHWE -> pop_nums in the config.yml file")
    assert max_entropy != -9 and best_popnum != -9  # Make sure we successfully identified an optimum population number
    with open("sHWE/.optimum_popnum.txt", 'w') as outfile:
        outfile.write(str(best_popnum))  # Write the optimum population number to a hidden file
```
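The rule above picks the population number whose sHWE p-value distribution is flattest (highest entropy) outside the first bin. A self-contained sketch of just that scoring step, on simulated p-values (`n_bins` stands in for `params.bins`):

```python
import numpy as np
import pandas as pd
import scipy.stats

rng = np.random.default_rng(1)
pvals = pd.Series(rng.uniform(size=10_000))  # simulated sHWE p-values for one candidate model
n_bins = 40                                  # stand-in for params.bins in the rule above

binned = pd.cut(pvals, n_bins).value_counts(sort=False)  # counts per equal-width bin
entropy = scipy.stats.entropy(binned.iloc[1:])           # skip the first bin (true HWE violations pile up near p=0)
print(f"entropy = {entropy:.4f}")  # higher = flatter distribution = better-specified model
```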
```python
run:
    with open("sHWE/.optimum_popnum.txt", 'r') as popfile:  # Grab the estimated population number from the hidden file
        popnum = popfile.readline()
    shell(f"{CMD_PREFIX} Rscript {CODE}/run_sHWE.R -p plink/DS/{BASE}_LDprune_ctrls_DS{{wildcards.f}} -d {popnum} -t {{params.threshold}} -o sHWE/{BASE}_sHWE-DS{{wildcards.f}} || touch sHWE/{BASE}_sHWE-DS{{wildcards.f}}")
```
```python
run:
    # Build the list of all successful sHWE runs (the .rmv.bim file exists only if sHWE completed successfully)
    file_list = [x for x in glob.glob(f"plink/DS/{BASE}_LDprune_ctrls_DS*.bim")
                 if os.path.exists(f"sHWE/{BASE}_sHWE-DS{int(x.removesuffix('.bim').replace('DS', '').split('_')[-1])}.rmv.bim")]
    with open('accessory/.tested_file_list.txt', 'w') as tested:  # Write the file list to a hidden file
        for file in file_list:
            tested.write(f"{file}\n")
    if file_list:
        # Print markers from the input .bim files IFF they were successfully tested (i.e. they appear in the file list)
        shell(f"find plink/DS/ -name '{BASE}_LDprune_ctrls_DS*.bim' | grep -F -x -f accessory/.tested_file_list.txt | xargs cat | cut -d \" \" -f 2 | sort -T {os.getcwd()} | uniq > accessory/sHWE_tested_snps.txt")
```
```python
run:
    with open('accessory/sHWE_tested_snps.txt', 'r') as tested:  # Count up the tested SNPs in the file
        num_tested = 0
        for num_tested, line in enumerate(tested, start=1):  # start=1 so num_tested equals the line count
            pass
    if num_tested < int(config['sHWE']['test_threshold']):
        print(f"Failed Checkpoint\nReason: Only {num_tested} markers were tested for sHWE, which is fewer than the requested threshold ({config['sHWE']['test_threshold']})\n"
              f"Increase the value set for the 'downsample_rounds' entry in the 'sHWE' section of the config file")
        shell("cp accessory/sHWE_tested_snps.txt accessory/.sHWE_tested_snps.txt; rm accessory/sHWE_tested_snps.txt; rm accessory/sHWE_downsampled_snps.txt")
    else:
        shell('touch accessory/.sHWE_pass.txt')
```
```python
run:
    if config['IBD']['filter_relateds'] == 'true':
        shell(f"zcat {{input[1]}} | awk {{params.awk_string}}; {CMD_PREFIX} Rscript {CODE}/filter_relateds.R -r accessory/related_IBS_results.txt -o accessory/excluded_related_samples.txt; "
              f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDp_sHWE --remove accessory/excluded_related_samples.txt --keep-allele-order --make-bed --out plink/{BASE}_LDp_sHWE_IBDflt {plnk_rsrc(rule, plink_run_specs)}")
    else:
        shell(f"zcat {{input[1]}} | awk {{params.awk_string}}; {CMD_PREFIX} Rscript {CODE}/filter_relateds.R -r accessory/related_IBS_results.txt -o accessory/excluded_related_samples.txt; "
              f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDp_sHWE --keep-allele-order --make-bed --out plink/{BASE}_LDp_sHWE_IBDflt {plnk_rsrc(rule, plink_run_specs)}")
```
```python
run:
    if config['match_controls'] == 'true':  # Run control-matching and parse results into a format PLINK can use to extract matched samples
        shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDp_sHWE_IBDflt --allow-no-sex --read-genome {{input[1]}} --cluster cc --mcc 1 {config['control_number']} --out plink/{BASE} {plnk_rsrc(rule, plink_run_specs)}; "
              f"{CMD_PREFIX} " + "awk '{{if(NF!=2) for (i = 2; i <= NF; i++) print $i}}'" + f" plink/{BASE}.cluster1 " +
              f"| cut -d \"(\" -f 1 | sed 's/_/\\t/' > accessory/samples2.txt; "
              "awk '{{split($2,a,\"_\"); printf(\"%s\\n%s\\n\",$2,a[1])}}' accessory/samples2.txt | sed 's/#//g' > accessory/samples.txt")
        # The sed command is a holdover from dealing with pound signs in sample names in the StJude data
        shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub_mflt --keep accessory/samples2.txt --make-bed --out input/{BASE}_CtlMat {plnk_rsrc(rule, plink_run_specs)}; "
              f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDp_sHWE_IBDflt --keep accessory/samples2.txt --make-bed --out input/{BASE}_PCA {plnk_rsrc(rule, plink_run_specs)}")
    else:
        shell("{CMD_PREFIX} awk '{{split($2,a,\"_\"); printf(\"%s\\n%s\\n\",$2,a[1])}}' plink/" + BASE + "_LDp_sHWE_IBDflt.fam | sed 's/#//g' > accessory/samples.txt")
        shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub_mflt --make-bed --out input/{BASE}_CtlMat {plnk_rsrc(rule, plink_run_specs)}; "
              f"{CMD_PREFIX} plink --bfile plink/{BASE}_LDp_sHWE_IBDflt --make-bed --out input/{BASE}_PCA {plnk_rsrc(rule, plink_run_specs)}")
```
```python
run:
    shell(f"head -n 1 {BASE}_chr1.admixmap.txt > header.txt; tail -q -n +2 {BASE}_chr*.admixmap.txt > body.txt; cat header.txt body.txt > {BASE}.admixmap.txt; rm header.txt; rm body.txt")
```
```python
run:
    cmd = f"{CMD_PREFIX} Rscript {CODE}/genesis_prep.R -p plink/{BASE}_sub_mflt -k {{input[1]}} -n {config['gwas']['pc_num']} -o input/{BASE} -t {config['kinship_threshold']}"
    if os.path.exists(config['covariate_file']):  # Not tested
        cmd += f" -C {config['covariate_file']}"
    shell(cmd)
```
```python
run:
    cmd = f"{CMD_PREFIX} Rscript {CODE}/genesis_gwas.R -g {{input[0]}} -p {{input[1]}} -k {{input[2]}} -n {config['gwas']['pc_num']} -C {config['gwas']['other_predictors']} -O {os.getcwd()} -o {BASE}.genesis.txt -c {{threads}}"
    shell(cmd)
```
```python
run:
    if os.path.exists(VCF):
        cmd = f"{CMD_PREFIX} plink2 --vcf {VCF} dosage=HDS --extract plink/{BASE}_sub_mflt.bim --const-fid 0 "
        if os.path.exists(config['samples']): cmd += f" --keep {config['samples']} "
        cmd += f" --make-pgen --out input/{BASE}_sub {plnk_rsrc(rule, plink_run_specs)}; " \
               f"{CMD_PREFIX} plink2 --pfile input/{BASE}_sub --export vcf vcf-dosage=HDS-force --out input/{BASE}_sub; " \
               f"{CMD_PREFIX} bgzip input/{BASE}_sub.vcf; {CMD_PREFIX} tabix input/{BASE}_sub.vcf.gz"
        print(cmd)
        shell(cmd)
    else:
        print(f"Did not find VCF file at provided path: {VCF}")
```
```python
run:
    with open("scripts/gather_report_data.sh", 'w') as report_cmds:
        report_line = f"echo 'rmarkdown::render(\"scripts/admixMap-report.Rmd\", output_file=\"{BASE}-Mapping-report.pdf\", " \
                      f"params=list(genesis_rslt=\"{BASE}.genesis.txt\", " \
                      f"fam_file=\"plink/{BASE}_sub_mflt.fam\", " \
                      f"input_directory=\"input/\", " \
                      f"blink_rslt=\"-9\", " \
                      f"gen_since_admix=\"{config['admixMapping']['gen_since_admixture']}\", " \
                      f"gwas_threshold=\"{config['gwas']['sig_threshold']}\", " \
                      f"ancestry_comps=\"{config['admixMapping']['ancestry_predictors']}\", " \
                      f"pruned_bim=\"plink/{BASE}_LDp_sHWE.bim\", " \
                      f"sHWE_entropy=\"sHWE/entropy.txt\", " \
                      f"sHWE_markers=\"accessory/sHWE_tested_snps.txt\", " \
                      f"ibd_ctrls=\"plink/{BASE}_LDp_ctrls_IBDflt.fam\", " \
                      f"eigenval=\"input/{BASE}.pcair.varprop\", " \
                      f"phenodat=\"input/{BASE}.genesis.pdat\", " \
                      f"kinplot=\"input/{BASE}.kinplot.png\", " \
                      f"mflt=\"accessory/missing_filter.txt\", " \
                      f"rulegraph_file=\"{BASE}-rulegraph.png\", "
        if f"{BASE}.admixmap.txt" in input:
            report_line += f"admixMap_rslt=\"{BASE}.admixmap.txt\", global_ancestry=\"{BASE}.globalancestry.txt\","
        else:
            report_line += f"admixMap_rslt=\"-9\", global_ancestry=\"-9\","
            shell(f"touch {BASE}.admixmap.sig.txt")
        if f"{BASE}.dos.genesis.txt" in input:
            report_line += f"dosage_rslt=\"{BASE}.dos.genesis.txt\","
        else:
            report_line += f"dosage_rslt=\"-9\","
            shell(f"touch {BASE}.dos.genesis.sig.txt")
        report_line += f"config_file=\"workflow/config.yml\"))' | R --vanilla"
        report_cmds.write(report_line)
    shell(f"{CMD_PREFIX} sh scripts/gather_report_data.sh; mv scripts/{BASE}-Mapping-report.pdf {BASE}-Mapping-report.pdf; mv input/*kinplot.png figures")
```
```python
run:
    shell(f"cut -f 1 {BASE}.genesis.sig.txt > {BASE}.genesis.sigID.txt")
    shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub_mflt --extract {BASE}.genesis.sigID.txt --keep-allele-order --freq case-control --allow-no-sex --recode vcf --out {BASE}.genesis.sig {plnk_rsrc(rule, plink_run_specs)}")
    cmd = f"{CMD_PREFIX} /usr/lib/jvm/java-8-openjdk-amd64/bin/java -Xmx8G -jar /snpEff/snpEff.jar ann -t GRCh37.75 {BASE}.genesis.sig.vcf | " \
          "awk '{{split($8,a,\"|\"); split(a[1],b,\"=\"); print $1,$2,$4,$5,a[4],a[3],a[2],b[2]}}' | grep -v \"#\" > {BASE}.genesis.sig.preannt.txt"
    shell(cmd)
    for soft in ['genesis']:
        cmd = f"{CMD_PREFIX} Rscript {CODE}/merge_annt.R -i {BASE}.{soft}.txt -s {BASE}.{soft}.sig.preannt.txt -r {config['annotation']['rsIDs']} -f {BASE}.{soft}.sig.frq.cc -o {BASE}.{soft}.sig.annt.txt"
        if os.path.exists(config['annotation']['typed_key']):
            cmd += f" -t {config['annotation']['typed_key']}"
        shell(cmd)
```
```python
run:
    dat = pd.read_table(input[0], sep=r"\s+")
    dat = dat[dat['ANC.p.value'] < float(config['admixMapping']['sig_threshold'])]
    dat[['chm', 'spos', 'epos', 'ANC.estimate', 'ANC.std.error', 'ANC.statistic', 'ANC.p.value', 'anc']].to_csv(f"{BASE}.admixmap.tmp.bed", header=False, sep="\t", index=False)
    gff = f"{os.getcwd()}/accessory/{os.path.basename(config['annotation']['gff'])}"
    genome = f"{os.getcwd()}/accessory/{os.path.basename(config['annotation']['genome'])}"  # File containing chromosome sizes
    if config['annotation']['download_refs'] == 'true':
        shell(f"wget {config['annotation']['gff']} -O {gff}")
        shell(f"wget {config['annotation']['genome']} -O {genome}")
    elif os.path.exists(config['annotation']['gff']) and os.path.exists(config['annotation']['genome']):
        shell(f"cp {config['annotation']['gff']} {gff}")
        shell(f"cp {config['annotation']['genome']} {genome}")
    else:
        print("Could not find reference .gff and/or .genome file. Check in the config file that the paths are correctly specified and/or that download is requested.")
    shell(f"{CMD_PREFIX} bedtools slop -i {BASE}.admixmap.tmp.bed -g {genome} -b {config['admixMapping']['annotation_buffer']} > {BASE}.admixmap.bed")
    cmd = f"{CMD_PREFIX} bedtools intersect -a {BASE}.admixmap.bed -b {gff} -loj | " \
          "awk '{{if($11==\"gene\") print $0}}' | awk '{{split($NF,a,\";\"); $NF=\"\"; for(x in a) if(a[x] ~ /Name=/) print $0,a[x]}}' | sed 's/Name=//' > {BASE}.admixmap.preannt.txt"
    shell(cmd)
    shell(f"{CMD_PREFIX} Rscript {CODE}/admixAnnt.R -i input/ -p {BASE}.admixmap.preannt.txt -b {config['admixMapping']['annotation_buffer']} -o {{output}}")
```
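The awk chain above keeps intersecting gene features and pulls the gene symbol out of the GFF attribute column. A rough Python equivalent of just that attribute-parsing step (illustrative only; it assumes a standard 9-column GFF3 line rather than the bedtools -loj layout, where the same fields start after the BED columns):

```python
def gene_name(gff_line):
    """Return the Name= attribute of a GFF3 'gene' feature, else None."""
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) < 9 or fields[2] != "gene":
        return None
    for attr in fields[8].split(";"):  # attributes are ;-separated key=value pairs
        if attr.startswith("Name="):
            return attr.removeprefix("Name=")
    return None

line = "1\thavana\tgene\t11869\t14409\t.\t+\t.\tID=gene:ENSG00000223972;Name=DDX11L1"
print(gene_name(line))  # -> DDX11L1
```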
```python
run:
    if os.path.exists(config['conditional_analysis']['snp_list']):
        if not os.path.exists('conditional-analysis'): os.mkdir('conditional-analysis')
        chrom, pos = f"{wildcards.marker}".split("-")
        shell(f"{CMD_PREFIX} plink --bfile plink/{BASE}_sub_mflt --snp {wildcards.marker.replace('-',':')} --recodeAD --out conditional-analysis/{wildcards.marker} {plnk_rsrc(rule, plink_run_specs)} || true")
        if os.path.exists(f"conditional-analysis/{wildcards.marker}.raw"):
            shell(f"{CMD_PREFIX} Rscript {CODE}/genesis_conditional_analysis.R -a {chrom} -b {pos} -d {config['conditional_analysis']['distance']} "
                  f"-m conditional-analysis/{wildcards.marker}.raw "
                  f"-g {{input[0]}} -p {{input[1]}} -k {{input[2]}} -n {config['gwas']['pc_num']} -C {config['gwas']['other_predictors']} "
                  f"-o conditional-analysis/{BASE}-{wildcards.marker}.genesis.txt")
        else:
            print(f"Did not find {wildcards.marker} in plink/{BASE}_sub_mflt")
            shell(f"touch conditional-analysis/{BASE}-{wildcards.marker}.genesis.txt")
```
```python
run:
    if not os.path.exists('fine-mapping'): os.mkdir('fine-mapping')
    chrom, pos = f"{wildcards.marker}".split("-")
    shell(f"{CMD_PREFIX} plink2 --pfile input/{BASE}_sub --snp {wildcards.marker.replace('-',':')} --window {int(config['fine_mapping']['distance'])} "
          f"--export A --out fine-mapping/{wildcards.marker} {plnk_rsrc(rule, plink_run_specs)} || true")
    # NOTE: phenotype data is not specified in the dosage input; it needs to be grabbed and merged from the data in the plink folder.
```
Support
Do you know this workflow well? If so, you can request seller status and start supporting this workflow.
Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/Monia234/admixMap
Name: admixmap
Version: 1
Downloaded: 0
Copyright: Public Domain
License: None