A fully reproducible and state-of-the-art ancient DNA analysis pipeline

Version: 2.4.7

Introduction

nf-core/eager is a scalable and reproducible bioinformatics best-practice processing pipeline for genomic NGS data, with a focus on ancient DNA (aDNA) data. It is ideal for the (palaeo)genomic analysis of humans, animals, plants, microbes and even microbiomes.

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. The pipeline pre-processes raw data from FASTQ inputs, or preprocessed BAM inputs. It can align reads and perform extensive general NGS and aDNA-specific quality control on the results. It comes with Docker and Singularity containers, or Conda environments, making installation trivial and results highly reproducible.

nf-core/eager schematic workflow

Quick Start

  1. Install Nextflow (>=20.07.1)

  2. Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort; see docs)

  3. Download the pipeline and test it on a minimal dataset with a single command:

    nextflow run nf-core/eager -profile test,<docker/singularity/podman/shifter/charliecloud/conda/institute>
    

    Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use -profile <institute> in your command. This will enable either docker or singularity and set the appropriate execution settings for your local compute environment.

  4. Start running your own analysis!

    nextflow run nf-core/eager -profile <docker/singularity/podman/conda/institute> --input '*_R{1,2}.fastq.gz' --fasta '<your_reference>.fasta'
    
  5. Once your run has completed successfully, clean up the intermediate files.

    nextflow clean -f -k
    

    See usage docs for all of the available options when running the pipeline.

N.B. You can see an overview of the run in the MultiQC report located at ./results/MultiQC/multiqc_report.html

Modifications to the default pipeline are easily made using various options as described in the documentation.
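
The quick-start commands run whichever pipeline revision Nextflow resolves by default. For a reproducible analysis it can help to pin the release with Nextflow's `-r` option. The following is a minimal sketch under stated assumptions: the helper name `build_eager_cmd` and its arguments are hypothetical, and the revision is assumed to be the 2.4.7 release listed at the top of this page. The helper only prints the command so that it can be inspected before being run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) an nf-core/eager command pinned to one release.
# EAGER_VERSION is an assumption taken from the version shown on this page.
EAGER_VERSION="2.4.7"

build_eager_cmd() {
  local profile="$1" input="$2" fasta="$3"
  # Single quotes around the paths keep globs such as *_R{1,2}.fastq.gz
  # from being expanded by the shell before Nextflow sees them.
  printf "nextflow run nf-core/eager -r %s -profile %s --input '%s' --fasta '%s'\n" \
    "$EAGER_VERSION" "$profile" "$input" "$fasta"
}

build_eager_cmd docker '*_R{1,2}.fastq.gz' 'reference.fasta'
```

Once the printed command looks right, it can be pasted into the terminal or saved in a run script alongside the results.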

Pipeline Summary

Default Steps

By default the pipeline currently performs the following:

  • Create reference genome indices for mapping (bwa, samtools, and picard)

  • Sequencing quality control (FastQC)

  • Sequencing adapter removal, paired-end data merging (AdapterRemoval)

  • Read mapping to reference (bwa aln, bwa mem, CircularMapper, or bowtie2)

  • Post-mapping processing, statistics and conversion to BAM (samtools)

  • Ancient DNA C-to-T damage pattern visualisation (DamageProfiler)

  • PCR duplicate removal (DeDup or MarkDuplicates)

  • Post-mapping statistics and BAM quality control (Qualimap)

  • Library complexity estimation (preseq)

  • Overall pipeline statistics summaries (MultiQC)

Additional Steps

Additional functionality contained by the pipeline currently includes:

Input

  • Automatic merging of complex sequencing setups (e.g. multiple lanes, sequencing configurations, library types)

Preprocessing

  • Illumina two-coloured sequencer poly-G tail removal (fastp)

  • Post-AdapterRemoval trimming of FASTQ files prior to mapping (fastp)

  • Automatic conversion of unmapped reads to FASTQ (samtools)

  • Host DNA (mapped reads) stripping from input FASTQ files (for sensitive samples)

aDNA Damage manipulation

  • Damage removal/clipping for UDG+/UDG-half treatment protocols (BamUtil)

  • Damaged reads extraction and assessment (PMDTools)

  • Nuclear DNA contamination estimation of human samples (angsd)

Genotyping

  • Creation of VCF genotyping files (GATK UnifiedGenotyper, GATK HaplotypeCaller and FreeBayes)

  • Creation of EIGENSTRAT genotyping files (pileupCaller)

  • Creation of Genotype Likelihood files (angsd)

  • Consensus sequence FASTA creation (VCF2Genome)

  • SNP Table generation (MultiVCFAnalyzer)

Biological Information

  • Mitochondrial to Nuclear read ratio calculation (MtNucRatioCalculator)

  • Statistical sex determination of human individuals (Sex.DetERRmine)

Metagenomic Screening

  • Low sequence complexity filtering (BBduk)

  • Taxonomic binner with alignment (MALT)

  • Taxonomic binner without alignment (Kraken2)

  • aDNA characteristic screening of taxonomically binned data from MALT (MaltExtract)

Functionality Overview

A graphical overview of suggested routes through the pipeline depending on context can be seen below.

nf-core/eager metro map

Documentation

The nf-core/eager pipeline comes with documentation about the pipeline: usage and output.

  1. Nextflow installation

  2. Pipeline configuration

  3. Running the pipeline

    • This includes tutorials, FAQs, and troubleshooting instructions
  4. Output and how to interpret the results

Credits

This pipeline was mostly written by Alexander Peltzer (apeltzer) and James A. Fellows Yates, with contributions from Stephen Clayton, Thiseas C. Lamnidis, Maxime Borry, Zandra Fagernäs, Aida Andrades Valtueña and Maxime Garcia and the nf-core community.

We thank the following people for their extensive assistance in the development of this pipeline:

Authors (alphabetical)

Additional Contributors (alphabetical)

Those who have provided conceptual guidance, suggestions, bug reports etc.

If you've contributed and you're missing in here, please let us know and we will add you in of course!

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #eager channel (you can join with this invite).

Citations

If you use nf-core/eager for your analysis, please cite the eager publication as follows:

Fellows Yates JA, Lamnidis TC, Borry M, Valtueña Andrades A, Fagernäs Z, Clayton S, Garcia MU, Neukamm J, Peltzer A. 2021. Reproducible, portable, and efficient ancient genome reconstruction with nf-core/eager. PeerJ 9:e10947. DOI: 10.7717/peerj.10947.

You can cite the eager zenodo record for a specific version using the following doi: 10.5281/zenodo.3698082

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

In addition, references of tools and data used in this pipeline are as follows:

Data References

This repository uses test data from the following studies:

  • Fellows Yates, J. A. et al. (2017) ‘Central European Woolly Mammoth Population Dynamics: Insights from Late Pleistocene Mitochondrial Genomes’, Scientific reports, 7(1), p. 17714. doi: 10.1038/s41598-017-17723-1.

  • Gamba, C. et al. (2014) ‘Genome flux and stasis in a five millennium transect of European prehistory’, Nature communications, 5, p. 5257. doi: 10.1038/ncomms6257.

  • Star, B. et al. (2017) ‘Ancient DNA reveals the Arctic origin of Viking Age cod from Haithabu, Germany’, Proceedings of the National Academy of Sciences of the United States of America, 114(34), pp. 9152–9157. doi: 10.1073/pnas.1710186114.

  • de Barros Damgaard, P. et al. (2018) ‘137 ancient human genomes from across the Eurasian steppes’, Nature, 557(7705), pp. 369–374. doi: 10.1038/s41586-018-0094-2.

Code Snippets

From line 193 of master/main.nf:
"""
pigz -f -d -p ${task.cpus} $zipped_fasta
"""
From line 504 of master/main.nf:
"""
bwa index $fasta
mkdir BWAIndex && mv ${fasta}* BWAIndex
"""
From line 533 of master/main.nf:
"""
bowtie2-build --threads ${task.cpus} $fasta $fasta
mkdir BT2Index && mv ${fasta}* BT2Index
"""
From line 575 of master/main.nf:
"""
samtools faidx $fasta
"""
From line 615 of master/main.nf:
"""
picard -Xmx${task.memory.toMega()}M CreateSequenceDictionary R=$fasta O="${fasta.baseName}.dict"
"""
From line 643 of master/main.nf:
"""
samtools fastq -t ${bam} | pigz -p ${task.cpus} > ${base}.converted.fastq.gz
""" 
From line 664 of master/main.nf:
"""
samtools index ${bam} ${size}
"""
From line 699 of master/main.nf:
"""
fastqc -t ${task.cpus} -q $r1 $r2
rename 's/_fastqc\\.zip\$/_raw_fastqc.zip/' *_fastqc.zip
rename 's/_fastqc\\.html\$/_raw_fastqc.html/' *_fastqc.html
"""
From line 705 of master/main.nf:
"""
fastqc -t ${task.cpus} -q $r1
rename 's/_fastqc\\.zip\$/_raw_fastqc.zip/' *_fastqc.zip
rename 's/_fastqc\\.html\$/_raw_fastqc.html/' *_fastqc.html
"""
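The `rename` calls in the FastQC snippets above use Perl-style substitution expressions, which assumes that the Perl variant of `rename` is installed. Where it is not, the same `_raw` tag can be added with plain shell parameter expansion. This is a minimal sketch, not code from the pipeline; the function name `tag_raw_fastqc` and the placeholder file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rename *_fastqc.zip / *_fastqc.html in a directory
# to *_raw_fastqc.zip / *_raw_fastqc.html without relying on Perl rename.
tag_raw_fastqc() {
  local dir="$1" ext f
  for ext in zip html; do
    for f in "$dir"/*_fastqc."$ext"; do
      [ -e "$f" ] || continue                              # glob matched nothing
      case "$f" in *_raw_fastqc."$ext") continue ;; esac   # already tagged, skip
      mv -- "$f" "${f%_fastqc.$ext}_raw_fastqc.$ext"
    done
  done
}

# Demonstration on empty placeholder files in a temporary directory
demo_dir="$(mktemp -d)"
touch "$demo_dir/sample_R1_fastqc.zip" "$demo_dir/sample_R1_fastqc.html"
tag_raw_fastqc "$demo_dir"
ls "$demo_dir"
```

Unlike the substitution expression, this sketch skips files that already carry the `_raw` tag, so running it twice does not rename them again.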
From line 746 of master/main.nf:
"""
fastp --in1 ${r1} --out1 "${r1.baseName}.pG.fq.gz" -A -g --poly_g_min_len "${params.complexity_filter_poly_g_min}" -Q -L -w ${task.cpus} --json "${r1.baseName}"_L${lane}_fastp.json 
"""
From line 750 of master/main.nf:
"""
fastp --in1 ${r1} --in2 ${r2} --out1 "${r1.baseName}.pG.fq.gz" --out2 "${r2.baseName}.pG.fq.gz" -A -g --poly_g_min_len "${params.complexity_filter_poly_g_min}" -Q -L -w ${task.cpus} --json "${libraryid}"_L${lane}_polyg_fastp.json 
"""
From line 820 of master/main.nf:
"""
mkdir -p output

AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} --collapse ${preserve5p} --trimns --trimqualities ${adapters_to_remove} --minlength ${params.clip_readlength} --minquality ${params.clip_min_read_quality} --minadapteroverlap ${params.min_adap_overlap}

cat *.collapsed.gz *.collapsed.truncated.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz > output/${base}.pe.combined.tmp.fq.gz

mv *.settings output/

## Add R_ and L_ for unmerged reads for DeDup compatibility
AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz

"""
From line 835 of master/main.nf:
"""
mkdir -p output

AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} --collapse ${preserve5p} --trimns --trimqualities ${adapters_to_remove} --minlength ${params.clip_readlength} --minquality ${params.clip_min_read_quality} --minadapteroverlap ${params.min_adap_overlap}

cat *.collapsed.gz *.singleton.truncated.gz *.pair1.truncated.gz *.pair2.truncated.gz > output/${base}.pe.combined.tmp.fq.gz

mv *.settings output/

## Add R_ and L_ for unmerged reads for DeDup compatibility
AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz

"""
From line 850 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe  --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} --collapse ${preserve5p} --trimns --trimqualities ${adapters_to_remove} --minlength ${params.clip_readlength} --minquality ${params.clip_min_read_quality} --minadapteroverlap ${params.min_adap_overlap}

cat *.collapsed.gz *.collapsed.truncated.gz > output/${base}.pe.combined.tmp.fq.gz

## Add R_ and L_ for unmerged reads for DeDup compatibility
AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz

mv *.settings output/
"""
From line 863 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe  --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} --collapse ${preserve5p} --trimns --trimqualities ${adapters_to_remove} --minlength ${params.clip_readlength} --minquality ${params.clip_min_read_quality} --minadapteroverlap ${params.min_adap_overlap}

cat *.collapsed.gz > output/${base}.pe.combined.tmp.fq.gz

## Add R_ and L_ for unmerged reads for DeDup compatibility
AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz

mv *.settings output/
"""
From line 877 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} --collapse ${preserve5p} --adapter1 "" --adapter2 ""

cat *.collapsed.gz *.pair1.truncated.gz *.pair2.truncated.gz > output/${base}.pe.combined.tmp.fq.gz

## Add R_ and L_ for unmerged reads for DeDup compatibility
AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz

mv *.settings output/
"""
From line 891 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} --collapse ${preserve5p}  --adapter1 "" --adapter2 ""

cat *.collapsed.gz > output/${base}.pe.combined.tmp.fq.gz

## Add R_ and L_ for unmerged reads for DeDup compatibility
AdapterRemovalFixPrefix -Xmx${task.memory.toGiga()}g output/${base}.pe.combined.tmp.fq.gz | pigz -p ${task.cpus - 1} > output/${base}.pe.combined.fq.gz

mv *.settings output/
"""
From line 904 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --file2 ${r2} --basename ${base}.pe --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} ${preserve5p} --trimns --trimqualities ${adapters_to_remove} --minlength ${params.clip_readlength} --minquality ${params.clip_min_read_quality} --minadapteroverlap ${params.min_adap_overlap}

mv ${base}.pe.pair*.truncated.gz *.settings output/
"""
From line 912 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --basename ${base}.se --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} ${preserve5p} --trimns --trimqualities ${adapters_to_remove} --minlength ${params.clip_readlength} --minquality ${params.clip_min_read_quality} --minadapteroverlap ${params.min_adap_overlap}
mv *.settings *.se.truncated.gz output/
"""
From line 919 of master/main.nf:
"""
mkdir -p output
AdapterRemoval --file1 ${r1} --basename ${base}.se --gzip --threads ${task.cpus} --qualitymax ${params.qualitymax} ${preserve5p} --adapter1 "" --adapter2 ""
mv *.settings *.se.truncated.gz output/
"""
From line 996 of master/main.nf:
"""
fastp --in1 ${r1} --trim_front1 ${params.post_ar_trim_front} --trim_tail1 ${params.post_ar_trim_tail} -A -G -Q -L -w ${task.cpus} --out1 "${libraryid}"_L"${lane}"_R1_postartrimmed.fq.gz
"""
From line 1000 of master/main.nf:
"""
fastp --in1 ${r1} --in2 ${r2}  --trim_front1 ${params.post_ar_trim_front} --trim_tail1 ${params.post_ar_trim_tail} --trim_front2 ${params.post_ar_trim_front2} --trim_tail2 ${params.post_ar_trim_tail2} -A -G -Q -L -w ${task.cpus} --out1 "${libraryid}"_L"${lane}"_R1_postartrimmed.fq.gz --out2 "${libraryid}"_L"${lane}"_R2_postartrimmed.fq.gz
"""
From line 1132 of master/main.nf:
"""
cat ${r1} > "${libraryid}"_R1_lanemerged.fq.gz
cat ${r2} > "${libraryid}"_R2_lanemerged.fq.gz
"""
From line 1137 of master/main.nf:
"""
cat ${r1} > "${libraryid}"_R1_lanemerged.fq.gz
"""
From line 1205 of master/main.nf:
"""
cat ${r1} > "${libraryid}"_R1_lanemerged.fq.gz
cat ${r2} > "${libraryid}"_R2_lanemerged.fq.gz
"""
From line 1210 of master/main.nf:
"""
cat ${r1} > "${libraryid}"_R1_lanemerged.fq.gz
"""
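The lane-merging snippets above join gzipped FASTQ files with plain `cat`. This works because the gzip format allows several compressed members to be concatenated into one valid file. Below is a small self-contained check of that property, using made-up read records and temporary files; all file names are illustrative.

```shell
#!/usr/bin/env bash
# Show that concatenating two .gz files yields a valid .gz file
# whose decompressed content is the two inputs back to back.
work="$(mktemp -d)"

# Two tiny single-record FASTQ files standing in for two sequencing lanes
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$work/lib_L1_R1.fq.gz"
printf '@read2\nTTGA\n+\nIIII\n' | gzip -c > "$work/lib_L2_R1.fq.gz"

# Same pattern as the pipeline snippets: cat the lanes into one merged file
cat "$work/lib_L1_R1.fq.gz" "$work/lib_L2_R1.fq.gz" > "$work/lib_R1_lanemerged.fq.gz"

# Four lines per FASTQ record, so two records give eight lines
merged_lines="$(gzip -dc "$work/lib_R1_lanemerged.fq.gz" | wc -l | tr -d ' ')"
echo "lines in merged file: $merged_lines"
```

Because no decompression and recompression is needed, merging this way costs little more than copying the files.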
From line 1238 of master/main.nf:
"""
fastqc -t ${task.cpus} -q ${r1} ${r2}
"""
From line 1242 of master/main.nf:
"""
fastqc -t ${task.cpus} -q ${r1}
"""
From line 1276 of master/main.nf:
"""
bwa aln -t ${task.cpus} $fasta ${r1} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -o ${params.bwaalno} -f ${libraryid}.r1.sai
bwa aln -t ${task.cpus} $fasta ${r2} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -o ${params.bwaalno} -f ${libraryid}.r2.sai
bwa sampe -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${samplename}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $fasta ${libraryid}.r1.sai ${libraryid}.r2.sai ${r1} ${r2} | samtools sort -@ ${task.cpus - 1} -O bam - > ${libraryid}_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
From line 1284 of master/main.nf:
"""
bwa aln -t ${task.cpus} ${fasta} ${r1} -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -o ${params.bwaalno} -f ${libraryid}.sai
bwa samse -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${samplename}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $fasta ${libraryid}.sai $r1 | samtools sort -@ ${task.cpus - 1} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
From line 1316 of master/main.nf:
"""
bwa mem -t ${split_cpus} $fasta $r1 $r2 -R "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${samplename}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" | samtools sort -@ ${split_cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
samtools index ${size} -@ ${task.cpus} "${libraryid}"_"${seqtype}".mapped.bam
"""
From line 1321 of master/main.nf:
"""
bwa mem -t ${split_cpus} $fasta $r1 -R "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${samplename}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" | samtools sort -@ ${split_cpus} -O bam - > "${libraryid}"_"${seqtype}".mapped.bam
samtools index -@ ${task.cpus} "${libraryid}"_"${seqtype}".mapped.bam ${size} 
"""
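The bwa snippets above all pass the same read group header. Inside a Nextflow script block, `\\t` becomes the two characters `\t` in the shell command, and the bwa manual describes `\t` in the read group string as being converted to a TAB in the output SAM. The sketch below builds the equivalent string in plain shell; the function name `make_read_group` and the example IDs are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit a read group string with literal backslash-t
# separators, in the layout used by the mapping snippets.
make_read_group() {
  local libraryid="$1" samplename="$2" seqtype="$3"
  # '\\t' in a printf format prints the two characters '\' and 't'
  printf '@RG\\tID:ILLUMINA-%s\\tSM:%s\\tPL:illumina\\tPU:ILLUMINA-%s-%s\n' \
    "$libraryid" "$samplename" "$libraryid" "$seqtype"
}

make_read_group LIB01 SAMPLE01 PE
```

The result can then be passed as a single quoted argument, for example `-R "$(make_read_group LIB01 SAMPLE01 PE)"` for `bwa mem`.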
From line 1352 of master/main.nf:
"""
circulargenerator -Xmx${task.memory.toGiga()}g -e ${params.circularextension} -i $fasta -s ${params.circulartarget}
bwa index $prefix
"""
From line 1382 of master/main.nf:
"""
bwa aln -t ${task.cpus} $elongated_root $r1 -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -f ${libraryid}.r1.sai
bwa aln -t ${task.cpus} $elongated_root $r2 -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -f ${libraryid}.r2.sai
bwa sampe -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${samplename}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $elongated_root ${libraryid}.r1.sai ${libraryid}.r2.sai $r1 $r2 > tmp.out
realignsamfile -Xmx${task.memory.toGiga()}g -e ${params.circularextension} -i tmp.out -r $fasta $filter 
samtools sort -@ ${task.cpus} -O bam tmp_realigned.bam > ${libraryid}_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size} 
"""
From line 1391 of master/main.nf:
""" 
bwa aln -t ${task.cpus} $elongated_root $r1 -n ${params.bwaalnn} -l ${params.bwaalnl} -k ${params.bwaalnk} -f ${libraryid}.sai
bwa samse -r "@RG\\tID:ILLUMINA-${libraryid}\\tSM:${samplename}\\tPL:illumina\\tPU:ILLUMINA-${libraryid}-${seqtype}" $elongated_root ${libraryid}.sai $r1 > tmp.out
realignsamfile -Xmx${task.memory.toGiga()}g -e ${params.circularextension} -i tmp.out -r $fasta $filter 
samtools sort -@ ${task.cpus} -O bam tmp_realigned.bam > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
From line 1462 of master/main.nf:
"""
bowtie2 -x ${fasta} -1 ${r1} -2 ${r2} -p ${split_cpus} ${sensitivity} ${bt2n} ${bt2l} ${trim5} ${trim3} --maxins ${params.bt2_maxins} --rg-id ILLUMINA-${libraryid} --rg SM:${samplename} --rg PL:illumina --rg PU:ILLUMINA-${libraryid}-${seqtype} 2> "${libraryid}"_bt2.log | samtools sort -@ ${split_cpus} -O bam > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
From line 1468 of master/main.nf:
"""
bowtie2 -x ${fasta} -U ${r1} -p ${split_cpus} ${sensitivity} ${bt2n} ${bt2l} ${trim5} ${trim3} --rg-id ILLUMINA-${libraryid} --rg SM:${samplename} --rg PL:illumina --rg PU:ILLUMINA-${libraryid}-${seqtype} 2> "${libraryid}"_bt2.log | samtools sort -@ ${split_cpus} -O bam > "${libraryid}"_"${seqtype}".mapped.bam
samtools index "${libraryid}"_"${seqtype}".mapped.bam ${size}
"""
From line 1537 of master/main.nf:
"""
samtools index $bam
extract_map_reads.py $bam ${r1} -m ${params.hostremoval_mode} $merged -of $out_fwd -t ${task.cpus} 
"""
From line 1544 of master/main.nf:
"""
samtools index $bam
extract_map_reads.py $bam ${r1} -rev ${r2} -m ${params.hostremoval_mode} $merged -of $out_fwd -or $out_rev -t ${task.cpus}
""" 
From line 1602 of master/main.nf:
"""
samtools merge ${libraryid}_seqtypemerged.bam ${bam}
samtools index ${libraryid}_seqtypemerged.bam ${size}
"""
From line 1628 of master/main.nf:
"""
samtools flagstat $bam > ${libraryid}_flagstat.stats
"""
From line 1664 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
samtools index ${libraryid}.filtered.bam ${size}
"""
From line 1669 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
samtools index ${libraryid}.filtered.bam ${size}
"""
From line 1674 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
samtools index ${libraryid}.filtered.bam ${size}
"""
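The BAM filtering snippets rely on SAM flag bit 4 (0x4, segment unmapped): `-f4` keeps only reads with that bit set, `-F4` drops them, and `-q` sets a minimum mapping quality. To make the logic concrete, the sketch below applies the same two tests to SAM text with awk rather than samtools. The function name `filter_mapped` and the three example records are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `samtools view -h -F4 -q <min_mapq>` on SAM text:
# keep header lines and mapped reads at or above a mapping quality threshold.
filter_mapped() {
  local min_mapq="$1"
  awk -F'\t' -v q="$min_mapq" '
    /^@/ { print; next }                      # pass header lines through
    {
      unmapped = int($2 / 4) % 2              # bit 4 of the FLAG column
      if (unmapped == 0 && $5 >= q) print     # column 5 is MAPQ
    }'
}

# Three minimal records: mapped with MAPQ 37, unmapped (flag 4), mapped with MAPQ 5
sam_demo="$(printf 'r1\t0\tchr1\t100\t37\t4M\t*\t0\t0\tACGT\tIIII\nr2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\nr3\t16\tchr1\t200\t5\t4M\t*\t0\t0\tACGT\tIIII\n')"
printf '%s\n' "$sam_demo" | filter_mapped 25 | cut -f1
```

With a threshold of 25 only `r1` survives; with a threshold of 0 the low-quality but mapped read `r3` is kept as well, while the unmapped `r2` is always dropped.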
From line 1680 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
samtools index ${libraryid}.filtered.bam ${size}

## FASTQ
samtools fastq -tN ${libraryid}.unmapped.bam | pigz -p ${task.cpus - 1} > ${libraryid}.unmapped.fastq.gz
rm ${libraryid}.unmapped.bam
"""
From line 1690 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > ${libraryid}.filtered.bam
samtools index ${libraryid}.filtered.bam ${size}

## FASTQ
samtools fastq -tN ${libraryid}.unmapped.bam | pigz -p ${task.cpus -1} > ${libraryid}.unmapped.fastq.gz
"""
From line 1700 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
samtools index ${libraryid}.filtered.bam ${size}
"""
From line 1706 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
samtools index ${libraryid}.filtered.bam ${size}
"""
From line 1712 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
samtools index ${libraryid}.filtered.bam ${size}
"""
From line 1719 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
samtools index ${libraryid}.filtered.bam ${size}

## FASTQ
samtools fastq -tN ${libraryid}.unmapped.bam | pigz -p ${task.cpus - 1} > ${libraryid}.unmapped.fastq.gz
rm ${libraryid}.unmapped.bam
"""
From line 1730 of master/main.nf:
"""
samtools view -h ${bam} -@ ${task.cpus} -f4 -b > ${libraryid}.unmapped.bam
samtools view -h ${bam} -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -b > tmp_mapped.bam
filter_bam_fragment_length.py -a -l ${params.bam_filter_minreadlength} -o ${libraryid} tmp_mapped.bam
samtools index ${libraryid}.filtered.bam ${size}

## FASTQ
samtools fastq -tN ${libraryid}.unmapped.bam | pigz -p ${task.cpus} > ${libraryid}.unmapped.fastq.gz
"""
From line 1771 of master/main.nf:
"""
samtools flagstat $bam > ${libraryid}_postfilterflagstat.stats
"""
From line 1815 of master/main.nf:
"""
endorS.py -o json -n ${libraryid} ${stats} ${poststats}
"""
From line 1819 of master/main.nf:
"""
endorS.py -o json -n ${libraryid} ${stats}
"""
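`endorS.py` takes the `samtools flagstat` outputs produced earlier and reports the percentage of endogenous (on-target) DNA. The core arithmetic, mapped reads divided by total reads, can be sketched as below. This is an illustration only and not the script's implementation: the function name `percent_mapped` and the flagstat excerpt are made up, and the parsing assumes the usual `N + M in total` and `N + M mapped (...)` line layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percentage of mapped reads from flagstat-style text.
percent_mapped() {
  awk '
    $4 == "in" && $5 == "total" { total = $1 }    # "N + M in total (...)"
    $4 == "mapped"              { mapped = $1 }   # "N + M mapped (...)", not "primary mapped"
    END {
      if (total > 0) printf "%.2f\n", 100 * mapped / total
      else print "NA"
    }'
}

# Made-up excerpt in samtools flagstat layout
flagstat_demo='200 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
50 + 0 mapped (25.00% : N/A)
50 + 0 primary mapped (25.00% : N/A)'
printf '%s\n' "$flagstat_demo" | percent_mapped
```

Applying the same calculation to the post-filter flagstat file would give the corresponding value after mapping quality filtering.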
From line 1850 of master/main.nf:
"""
mv ${bam} ${libraryid}.bam
dedup -Xmx${task.memory.toGiga()}g -i ${libraryid}.bam $treat_merged -o . -u 
mv *.log dedup.log
samtools sort -@ ${task.cpus} "${libraryid}"_rmdup.bam -o "${libraryid}"_rmdup.bam
samtools index "${libraryid}"_rmdup.bam ${size}
"""
From line 1858 of master/main.nf:
"""
dedup -Xmx${task.memory.toGiga()}g -i ${libraryid}.bam $treat_merged -o . -u 
mv *.log dedup.log
samtools sort -@ ${task.cpus} "${libraryid}"_rmdup.bam -o "${libraryid}"_rmdup.bam
samtools index "${libraryid}"_rmdup.bam ${size}
"""
From line 1888 of master/main.nf:
"""
mv ${bam} ${libraryid}.bam
picard -Xmx${task.memory.toMega()}M MarkDuplicates INPUT=${libraryid}.bam OUTPUT=${libraryid}_rmdup.bam REMOVE_DUPLICATES=TRUE AS=TRUE METRICS_FILE="${libraryid}_rmdup.metrics" VALIDATION_STRINGENCY=SILENT
samtools index ${libraryid}_rmdup.bam ${size}
"""
From line 1894 of master/main.nf:
"""
picard -Xmx${task.memory.toMega()}M MarkDuplicates INPUT=${libraryid}.bam OUTPUT=${libraryid}_rmdup.bam REMOVE_DUPLICATES=TRUE AS=TRUE METRICS_FILE="${libraryid}_rmdup.metrics" VALIDATION_STRINGENCY=SILENT
samtools index ${libraryid}_rmdup.bam ${size}
"""
From line 1972 of master/main.nf:
"""
samtools merge ${samplename}_udg${udg}_libmerged_rmdup.bam ${bam}
samtools index ${samplename}_udg${udg}_libmerged_rmdup.bam ${size}
"""
From line 2022 of master/main.nf:
"""
preseq c_curve -s ${params.preseq_step_size} -o ${input.baseName}.preseq -H ${input}
"""
From line 2026 of master/main.nf:
"""
preseq c_curve -s ${params.preseq_step_size} -o ${input.baseName}.preseq -B ${input} ${pe_mode}
"""
From line 2030 of master/main.nf:
"""
preseq c_curve -s ${params.preseq_step_size} -o ${input.baseName}.preseq -B ${input} ${pe_mode}
"""
From line 2034 of master/main.nf:
"""
preseq lc_extrap -s ${params.preseq_step_size} -o ${input.baseName}.preseq -H ${input} -n ${params.preseq_bootstrap} -e ${params.preseq_maxextrap} -cval ${params.preseq_cval} -x ${params.preseq_terms}
"""
From line 2038 of master/main.nf:
"""
preseq lc_extrap -s ${params.preseq_step_size} -o ${input.baseName}.preseq -B ${input} ${pe_mode} -n ${params.preseq_bootstrap} -e ${params.preseq_maxextrap} -cval ${params.preseq_cval} -x ${params.preseq_terms}
"""
From line 2042 of master/main.nf:
"""
preseq lc_extrap -s ${params.preseq_step_size} -o ${input.baseName}.preseq -B ${input} ${pe_mode} -n ${params.preseq_bootstrap} -e ${params.preseq_maxextrap} -cval ${params.preseq_cval} -x ${params.preseq_terms}
"""
From line 2066 of master/main.nf:
"""
## Create genome file from bam header
samtools view -H ${bam} | grep '@SQ' | sed 's#@SQ\tSN:\\|LN:##g' > genome.txt

##  Run bedtools
bedtools coverage -nonamecheck -g genome.txt -sorted -a ${anno_file} -b ${bam} | pigz -p ${task.cpus - 1} > "${bam.baseName}".breadth.gz
bedtools coverage -nonamecheck -g genome.txt -sorted -a ${anno_file} -b ${bam} -mean | pigz -p ${task.cpus - 1} > "${bam.baseName}".depth.gz
"""
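The `sed` expression in the bedtools coverage snippet above turns each `@SQ` header line into a `name<TAB>length` pair for the bedtools `-g` genome file. The sketch below performs the same step more explicitly with awk and also drops any extra `@SQ` fields (for example `M5:`), which the `sed` substitution would leave in place. The function name `sq_to_genome` and the sample header are invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert SAM/BAM header text on stdin into a
# two-column genome file (sequence name, sequence length).
sq_to_genome() {
  awk -F'\t' '$1 == "@SQ" {
    name = ""; len = ""
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^SN:/)      name = substr($i, 4)   # strip the "SN:" tag
      else if ($i ~ /^LN:/) len  = substr($i, 4)   # strip the "LN:" tag
    }
    print name "\t" len
  }'
}

# Made-up header with one non-@SQ line and one @SQ line carrying an extra field
header_demo="$(printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chrM\tLN:16569\tM5:abc\n')"
printf '%s\n' "$header_demo" | sq_to_genome
```

In a real run the input would come from `samtools view -H` on the BAM file, as in the snippet above.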
From line 2104 of master/main.nf:
"""
damageprofiler -Xmx${task.memory.toGiga()}g -i $bam -r $fasta -l ${params.damageprofiler_length} -t ${params.damageprofiler_threshold} -o . -yaxis_damageplot ${params.damageprofiler_yaxis}
"""
From line 2134 of master/main.nf:
"""
mapDamage -i ${bam} -r ${fasta} --rescale --rescale-out="${base}_rescaled.bam" --seq-length=${params.rescale_seqlength} ${rescale_length_5p} ${rescale_length_3p} ${singlestranded}
samtools index ${base}_rescaled.bam ${size}
"""
From line 2159 of master/main.nf:
"""
bedtools maskfasta -fi ${fasta} -bed ${bedfile} -fo ${fasta.baseName}_masked.fa
"""
From line 2191 of master/main.nf:
"""
#Run Filtering step 
samtools calmd ${bam} ${fasta} | pmdtools --threshold ${params.pmdtools_threshold} ${treatment} --header | samtools view -Sb - > "${libraryid}".pmd.bam

#Run Calc Range step
## To allow early shut off of pipe: https://github.com/nextflow-io/nextflow/issues/1564
trap 'if [[ \$? == 141 ]]; then echo "Shutting samtools early due to -n parameter" && samtools index ${libraryid}.pmd.bam ${size}; exit 0; fi' EXIT
samtools calmd ${bam} ${fasta} | pmdtools --deamination ${platypus} --range ${params.pmdtools_range} ${treatment} -n ${params.pmdtools_max_reads} > "${libraryid}".cpg.range."${params.pmdtools_range}".txt

samtools index ${libraryid}.pmd.bam ${size}
"""
From line 2246 of master/main.nf:
"""
bam trimBam $bam tmp.bam -L ${left_clipping} -R ${right_clipping} ${softclip}
samtools sort -@ ${task.cpus} tmp.bam -o ${libraryid}.trimmed.bam 
samtools index ${libraryid}.trimmed.bam ${size}
"""
From line 2296 of master/main.nf:
"""
samtools merge ${samplename}_libmerged_add.bam ${bam}
samtools index ${samplename}_libmerged_add.bam ${size}
"""
From line 2326 of master/main.nf:
"""
qualimap bamqc -bam $bam -nt ${task.cpus} -outdir . -outformat "HTML" ${snpcap} --java-mem-size=${task.memory.toGiga()}G
"""
From line 2392 of master/main.nf:
"""
picard -Xmx${task.memory.toGiga()}g AddOrReplaceReadGroups I=${bam} O=${samplename}_rg.bam RGID=1 RGLB="${samplename}_rg" RGPL=illumina RGPU=4410 RGSM="${samplename}_rg" VALIDATION_STRINGENCY=LENIENT
samtools index ${samplename}_rg.bam ${size}
"""
From line 2429 of master/main.nf:
"""
samtools index -b ${bam}
gatk3 -Xmx${task.memory.toGiga()}g -T RealignerTargetCreator -R ${fasta} -I ${bam} -nt ${task.cpus} -o ${samplename}.intervals ${defaultbasequalities}
gatk3 -Xmx${task.memory.toGiga()}g -T IndelRealigner -R ${fasta} -I ${bam} -targetIntervals ${samplename}.intervals -o ${samplename}.realign.bam ${defaultbasequalities}
gatk3 -Xmx${task.memory.toGiga()}g -T UnifiedGenotyper -R ${fasta} -I ${samplename}.realign.bam -o ${samplename}.unifiedgenotyper.vcf -nt ${task.cpus} --genotype_likelihoods_model ${params.gatk_ug_genotype_model} -stand_call_conf ${params.gatk_call_conf} --sample_ploidy ${params.gatk_ploidy} -dcov ${params.gatk_downsample} --output_mode ${params.gatk_ug_out_mode} ${defaultbasequalities}

$keep_realign

bgzip -@ ${task.cpus} ${samplename}.unifiedgenotyper.vcf
"""
From line 2440 of master/main.nf:
"""
samtools index ${bam}
gatk3 -Xmx${task.memory.toGiga()}g -T RealignerTargetCreator -R ${fasta} -I ${bam} -nt ${task.cpus} -o ${samplename}.intervals ${defaultbasequalities}
gatk3 -Xmx${task.memory.toGiga()}g -T IndelRealigner -R ${fasta} -I ${bam} -targetIntervals ${samplename}.intervals -o ${samplename}.realign.bam ${defaultbasequalities}
gatk3 -Xmx${task.memory.toGiga()}g -T UnifiedGenotyper -R ${fasta} -I ${samplename}.realign.bam -o ${samplename}.unifiedgenotyper.vcf -nt ${task.cpus} --dbsnp ${params.gatk_dbsnp} --genotype_likelihoods_model ${params.gatk_ug_genotype_model} -stand_call_conf ${params.gatk_call_conf} --sample_ploidy ${params.gatk_ploidy} -dcov ${params.gatk_downsample} --output_mode ${params.gatk_ug_out_mode} ${defaultbasequalities}

$keep_realign

bgzip -@  ${task.cpus} ${samplename}.unifiedgenotyper.vcf
"""
From line 2473 of master/main.nf:
"""
gatk HaplotypeCaller --java-options "-Xmx${task.memory.toGiga()}G" -R ${fasta} -I ${bam} -O ${samplename}.haplotypecaller.vcf -stand-call-conf ${params.gatk_call_conf} --sample-ploidy ${params.gatk_ploidy} --output-mode ${params.gatk_hc_out_mode} --emit-ref-confidence ${params.gatk_hc_emitrefconf}
bgzip -@ ${task.cpus} ${samplename}.haplotypecaller.vcf
"""
From line 2479 of master/main.nf:
"""
gatk HaplotypeCaller --java-options "-Xmx${task.memory.toGiga()}G" -R ${fasta} -I ${bam} -O ${samplename}.haplotypecaller.vcf --dbsnp ${params.gatk_dbsnp} -stand-call-conf ${params.gatk_call_conf} --sample-ploidy ${params.gatk_ploidy} --output-mode ${params.gatk_hc_out_mode} --emit-ref-confidence ${params.gatk_hc_emitrefconf}
bgzip -@  ${task.cpus} ${samplename}.haplotypecaller.vcf
"""
From line 2506 of master/main.nf:
"""
freebayes -f ${fasta} -p ${params.freebayes_p} -C ${params.freebayes_C} ${skip_coverage} ${bam} > ${samplename}.freebayes.vcf
bgzip -@  ${task.cpus} ${samplename}.freebayes.vcf
"""
From line 2587 of master/main.nf:
"""
samtools mpileup -B --ignore-RG -q ${map_q} -Q ${base_q} ${use_bed} -f ${fasta} ${bam_list} | pileupCaller ${caller} ${ssmode} ${transitions_mode} --sampleNames ${sample_names} ${use_snp} -e pileupcaller.${strandedness}
"""
From line 2610 of master/main.nf:
"""
eigenstrat_snp_coverage -i pileupcaller.${strandedness} >${strandedness}_eigenstrat_coverage.txt -j ${strandedness}_eigenstrat_coverage_mqc.json
"""
"""
eigenstrat_snp_coverage -i pileupcaller.${strandedness} >${strandedness}_eigenstrat_coverage.txt
parse_snp_cov.py ${strandedness}_eigenstrat_coverage.txt
"""
NextFlow From line 2614 of master/main.nf
"""
echo ${bam} > bam.filelist
mkdir angsd
angsd -bam bam.filelist -nThreads ${task.cpus} -GL ${angsd_glmodel} -doGlF ${angsd_glformat} ${angsd_majorminor} ${angsd_fasta} -out ${samplename}.angsd
"""
"""
bcftools stats *.vcf.gz -F ${fasta} > ${samplename}.vcf.stats
"""
"""
pigz -d -f -p ${task.cpus} ${vcf}
vcf2genome -Xmx${task.memory.toGiga()}g -draft ${out} -draftname "${fasta_head}" -in ${vcf.baseName} -minc ${params.vcf2genome_minc} -minfreq ${params.vcf2genome_minfreq} -minq ${params.vcf2genome_minq} -ref ${fasta} -refMod ${out}_refmod.fasta -uncertain ${out}_uncertainty.fasta
pigz -f -p ${task.cpus} ${out}*
bgzip -@ ${task.cpus} *.vcf
"""
"""
pigz -d -f -p ${task.cpus} ${vcf}
multivcfanalyzer -Xmx${task.memory.toGiga()}g ${params.snp_eff_results} ${fasta} ${params.reference_gff_annotations} . ${write_freqs} ${params.min_genotype_quality} ${params.min_base_coverage} ${params.min_allele_freq_hom} ${params.min_allele_freq_het} ${params.reference_gff_exclude} *.vcf
pigz -p ${task.cpus} *.tsv *.txt snpAlignment.fasta snpAlignmentIncludingRefGenome.fasta fullAlignment.fasta
bgzip -@ ${task.cpus} *.vcf
"""
NextFlow From line 2761 of master/main.nf
"""
mtnucratio -Xmx${task.memory.toGiga()}g ${bam} "${params.mtnucratio_header}"
"""
"""
mv ${bam} ${bam.baseName}_${strandedness}strand.bam
"""
NextFlow From line 2812 of master/main.nf
"""
ls *.bam >> bamlist.txt
samtools depth -aa -q30 -Q30 $filter -f bamlist.txt | sexdeterrmine -f bamlist.txt > SexDet.txt
"""
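The `ls *.bam >> bamlist.txt` line in the sex-determination snippet above appends to the list rather than overwriting it. A minimal sketch of the difference, using empty placeholder files (the file names are made up for illustration) instead of real BAMs:

```shell
# Work in a scratch directory with two empty placeholder files (not real BAMs).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch sample_a.bam sample_b.bam

# '>>' appends: running the listing twice records every file twice.
ls *.bam >> appended.txt
ls *.bam >> appended.txt

# '>' truncates first: the list always reflects the current directory contents.
ls *.bam > overwritten.txt
ls *.bam > overwritten.txt

appended_count=$(wc -l < appended.txt | tr -d ' ')
overwritten_count=$(wc -l < overwritten.txt | tr -d ' ')
echo "appended: $appended_count lines, overwritten: $overwritten_count lines"
```

Nextflow normally runs each task in its own fresh work directory, so the append is usually harmless there; the distinction matters mainly when the command is re-run by hand in the same directory.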
"""
samtools index ${input}
angsd -i ${input} -r ${params.contamination_chrom_name}:5000000-154900000 -doCounts 1 -iCounts 1 -minMapQ 30 -minQ 30 -out ${libraryid}.doCounts
contamination -a ${libraryid}.doCounts.icnts.gz -h ${projectDir}/assets/angsd_resources/HapMapChrX.gz 2> ${libraryid}.X.contamination.out
"""
"""
print_x_contamination.py ${Contam.join(' ')}
"""
NextFlow From line 2882 of master/main.nf
"""
bbduk.sh -Xmx${task.memory.toGiga()}g in=${fastq} threads=${task.cpus} entropymask=f entropy=${params.metagenomic_complexity_entropy} out=${fastq}_lowcomplexityremoved.fq.gz 2> ${fastq}_bbduk.stats
"""
"""
malt-run \
-J-Xmx${task.memory.toGiga()}g \
-t ${task.cpus} \
-v \
-o . \
-d ${db} \
${sam_out} \
-id ${params.percent_identity} \
-m ${params.malt_mode} \
-at ${params.malt_alignment_mode} \
-top ${params.malt_top_percent} \
${min_supp} \
-mq ${params.malt_max_queries} \
--memoryMode ${params.malt_memory_mode} \
-i ${fastqs.join(' ')} |&tee malt.log
"""
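The `|&tee malt.log` at the end of the `malt-run` snippet above is bash shorthand for `2>&1 | tee malt.log`: both stdout and stderr are written to the log while still being passed on. A minimal sketch with a hypothetical stand-in function (`fake_tool` is not part of the pipeline), written in the portable `2>&1 |` form:

```shell
# Hypothetical stand-in for malt-run: one line on stdout, one on stderr.
fake_tool() {
    echo "progress on stdout"
    echo "warning on stderr" >&2
}

workdir=$(mktemp -d)

# 'cmd |& tee file' (bash) is equivalent to 'cmd 2>&1 | tee file'.
# tee's own stdout is discarded here only to keep the demo quiet.
fake_tool 2>&1 | tee "$workdir/malt.log" > /dev/null

# Both streams ended up in the log file.
log_lines=$(wc -l < "$workdir/malt.log" | tr -d ' ')
echo "lines captured: $log_lines"
```

One design consequence worth keeping in mind: without `set -o pipefail`, the exit status of such a pipeline is that of `tee`, not of the tool in front of it. Whether the pipeline enables `pipefail` for this process is not shown in the snippet.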
"""
MaltExtract \
-Xmx${task.memory.toGiga()}g \
-t ${taxon_list} \
-i ${rma6.join(' ')} \
-o results/ \
-r ${ncbifiles} \
-p ${task.cpus} \
-f ${params.maltextract_filter} \
-a ${params.maltextract_toppercent} \
--minPI ${params.maltextract_percentidentity} \
${destack} \
${downsam} \
${dupremo} \
${matches} \
${megsum} \
${topaln} \
${ss}

postprocessing.AMPS.r -r results/ -m ${params.maltextract_filter} -t ${task.cpus} -n ${taxon_list} -j
"""
"""
tar xvzf $ckdb
mkdir -p $dbname
mv *.k2d $dbname || echo "nothing to do"
"""
NextFlow From line 3052 of master/main.nf
"""
kraken2 --db ${krakendb} --threads ${task.cpus} --output $out --report-minimizer-data --report $kreport $fastq
cut -f1-3,6-8 $kreport > $kreport_old
"""
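The `cut -f1-3,6-8` step in the Kraken2 snippet above drops columns 4 and 5 of the report. With `--report-minimizer-data`, Kraken2 places the two minimizer columns in those positions, so the cut restores the classic six-column report layout. A small sketch on a single mock report line (all values are made up for illustration):

```shell
report=$(mktemp)
old_report=$(mktemp)

# Mock report line with minimizer data. Columns: percent, clade reads,
# taxon reads, minimizers, distinct minimizers, rank code, taxid, name.
printf '  1.50\t150\t120\t3400\t2100\tS\t9606\t    Homo sapiens\n' > "$report"

# Keep columns 1-3 and 6-8, i.e. drop the two minimizer columns.
cut -f1-3,6-8 "$report" > "$old_report"

n_cols=$(awk -F'\t' '{ print NF }' "$old_report")
rank_col=$(cut -f4 "$old_report")
echo "columns after cut: $n_cols, fourth column is now: $rank_col"
```

Keeping both files, as the pipeline snippet does, leaves the minimizer data available while giving tools that expect the older layout a file they can read.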
"""
kraken_parse.py -c ${params.metagenomic_min_support_reads} -or $read_out -ok $kmer_out $kraken_r
"""    
NextFlow From line 3106 of master/main.nf
"""
merge_kraken_res.py -or $read_out -ok $kmer_out
"""    
NextFlow From line 3123 of master/main.nf
"""
markdown_to_html.py $output_docs -o results_description.html
"""
NextFlow From line 3146 of master/main.nf
"""
echo $workflow.manifest.version &> v_pipeline.txt
echo $workflow.nextflow.version &> v_nextflow.txt

fastqc -t ${task.cpus} --version &> v_fastqc.txt 2>&1 || true
AdapterRemoval --version  &> v_adapterremoval.txt 2>&1 || true
fastp --version &> v_fastp.txt 2>&1 || true
bwa &> v_bwa.txt 2>&1 || true
circulargenerator -Xmx${task.memory.toGiga()}g --help | head -n 1 &> v_circulargenerator.txt 2>&1 || true
samtools --version &> v_samtools.txt 2>&1 || true
dedup -Xmx${task.memory.toGiga()}g -v &> v_dedup.txt 2>&1 || true
## The bioconda recipe for picard is set up incorrectly and emits an extra warning on stderr; this ugly command ensures only the version is exported
( exec 7>&1; picard -Xmx${task.memory.toMega()}M MarkDuplicates --version 2>&1 >&7 | grep -v '/' >&2 ) 2> v_markduplicates.txt || true
qualimap --version --java-mem-size=${task.memory.toGiga()}G &> v_qualimap.txt 2>&1 || true
preseq &> v_preseq.txt 2>&1 || true
gatk --java-options "-Xmx${task.memory.toGiga()}G" --version 2>&1 | grep '(GATK)' > v_gatk.txt 2>&1 || true
gatk3 -Xmx${task.memory.toGiga()}g  --version 2>&1 | head -n 1 > v_gatk3.txt 2>&1 || true
freebayes --version &> v_freebayes.txt 2>&1 || true
bedtools --version &> v_bedtools.txt 2>&1 || true
damageprofiler -Xmx${task.memory.toGiga()}g --version &> v_damageprofiler.txt 2>&1 || true
bam --version &> v_bamutil.txt 2>&1 || true
pmdtools --version &> v_pmdtools.txt 2>&1 || true
angsd -h |& head -n 1 | cut -d ' ' -f3-4 &> v_angsd.txt 2>&1 || true 
multivcfanalyzer -Xmx${task.memory.toGiga()}g --help | head -n 1 &> v_multivcfanalyzer.txt 2>&1 || true
malt-run -J-Xmx${task.memory.toGiga()}g --help |& tail -n 3 | head -n 1 | cut -f 2 -d'(' | cut -f 1 -d ',' &> v_malt.txt 2>&1 || true
MaltExtract -Xmx${task.memory.toGiga()}g --help | head -n 2 | tail -n 1 &> v_maltextract.txt 2>&1 || true
multiqc --version &> v_multiqc.txt 2>&1 || true
vcf2genome -Xmx${task.memory.toGiga()}g -h |& head -n 1 &> v_vcf2genome.txt || true
mtnucratio -Xmx${task.memory.toGiga()}g --help &> v_mtnucratiocalculator.txt || true
sexdeterrmine --version &> v_sexdeterrmine.txt || true
kraken2 --version | head -n 1 &> v_kraken.txt || true
endorS.py --version &> v_endorSpy.txt || true
pileupCaller --version &> v_sequencetools.txt 2>&1 || true
bowtie2 --version | grep -a 'bowtie2-.* -fdebug' > v_bowtie2.txt || true
eigenstrat_snp_coverage --version | cut -d ' ' -f2 >v_eigenstrat_snp_coverage.txt || true
mapDamage --version > v_mapdamage.txt || true
bbversion.sh > v_bbduk.txt || true
bcftools --version | grep 'bcftools' | cut -d ' ' -f 2 > v_bcftools.txt || true
scrape_software_versions.py &> software_versions_mqc.yaml
"""
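Two shell idioms carry most of the version-capture snippet above: `... 2>&1 || true`, which records whatever a tool prints on either stream and never fails the task, and the file-descriptor shuffle on the picard line, which filters stderr only. A minimal sketch with hypothetical stand-in functions (`fake_tool`, `noisy_tool` and the version string are invented for illustration), written with the portable `> file 2>&1` form instead of bash's `&>`:

```shell
workdir=$(mktemp -d)

# Hypothetical tool that prints its version on stderr and exits non-zero.
fake_tool() {
    echo "Version: 0.0.1" >&2
    return 1
}

# Capture both streams; '|| true' keeps a strict ('set -e') shell alive.
set -e
fake_tool > "$workdir/v_fake.txt" 2>&1 || true
set +e
captured=$(cat "$workdir/v_fake.txt")

# Hypothetical tool whose stderr mixes a path-containing warning with the version.
noisy_tool() {
    echo "/some/path/java: a warning" >&2
    echo "9.9.9" >&2
    echo "regular output"
}

# fd 7 keeps the original stdout; stderr goes through the pipe to grep,
# which drops lines containing '/', and grep's output is sent back to
# stderr, which the outer '2>' redirects into the file.
( exec 7>&1; noisy_tool 2>&1 >&7 | grep -v '/' >&2 ) 2> "$workdir/v_noisy.txt"
filtered=$(cat "$workdir/v_noisy.txt")
echo "captured: $captured / filtered: $filtered"
```

The trade-off of `|| true` is that a missing tool also passes silently and simply leaves an error message or an empty file behind, so the version files themselves are the place to look when a version is absent from the report.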
"""
multiqc -f $rtitle $rfilename $multiqc_config $custom_config_file .
"""
