Analysis pipeline for CUT&RUN and CUT&Tag experiments, including QC, spike-in support, IgG controls, peak calling and downstream analysis.


Introduction

nf-core/cutandrun is a best-practice bioinformatics analysis pipeline for the CUT&RUN, CUT&Tag and TIPseq experimental protocols, which were developed to study protein-DNA interactions and epigenomic profiling.

CUT&RUN

Meers, M. P., Bryson, T. D., Henikoff, J. G., & Henikoff, S. (2019). Improved CUT&RUN chromatin profiling tools. eLife, 8. https://doi.org/10.7554/eLife.46314

CUT&Tag

Kaya-Okur, H. S., Wu, S. J., Codomo, C. A., Pledger, E. S., Bryson, T. D., Henikoff, J. G., Ahmad, K., & Henikoff, S. (2019). CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nature Communications, 10(1), 1930. https://doi.org/10.1038/s41467-019-09982-5

TIPseq

Bartlett, D. A., Dileep, V., Handa, T., Ohkawa, Y., Kimura, H., Henikoff, S., & Gilbert, D. M. (2021). High-throughput single-cell epigenomic profiling by targeted insertion of promoters (TIP-seq). Journal of Cell Biology, 220(12), e202103078. https://doi.org/10.1083/jcb.202103078

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable, reproducible manner. It uses containerisation and package management, making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process, which makes it easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules.
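For example, a specific release of the pipeline can be pinned and a container engine chosen at run time. The commands below are a minimal sketch following standard Nextflow and nf-core conventions (the revision and profile are illustrative):

# Fetch a pinned release of the pipeline
nextflow pull nf-core/cutandrun -r 3.1

# Print the pipeline help using Singularity containers;
# swap the profile for docker, podman, conda, etc.
nextflow run nf-core/cutandrun -r 3.1 -profile singularity --help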

The pipeline has been developed with continuous integration (CI) and test-driven development (TDD) at its core. nf-core code and module linting, as well as a battery of over 100 unit and integration tests, run on every pull request to the main repository and on each release of the pipeline. On official release, automated CI tests run the pipeline on a full-sized dataset on AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible default resource allocations for real-world datasets, and allows results to be stored persistently for benchmarking between pipeline releases and against other analysis sources. The results from the full-sized test can be viewed on the nf-core website.

pipeline_diagram

Pipeline summary

  1. Check input files

  2. Merge re-sequenced FastQ files ( cat )

  3. Read QC ( FastQC )

  4. Adapter and quality trimming ( Trim Galore! )

  5. Alignment to both target and spike-in genomes ( Bowtie 2 )

  6. Filter on quality, sort and index alignments ( samtools )

  7. Duplicate read marking ( picard )

  8. Create bedGraph files ( bedtools )

  9. Create bigWig coverage files ( bedGraphToBigWig )

  10. Peak calling ( SEACR , MACS2 )

  11. Consensus peak merging and reporting ( bedtools )

  12. Library complexity ( preseq )

  13. Fragment-based quality control ( deepTools )

  14. Peak-based quality control ( bedtools , custom python)

  15. Heatmap peak analysis ( deepTools )

  16. Genome browser session ( IGV )

  17. Present all QC in web-based report ( MultiQC )
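The alignment, coverage and peak-calling steps (5, 6, 8, 9 and 10 above) correspond roughly to the shell commands below. This is a simplified sketch with illustrative file names only; in the pipeline each tool runs in its own containerised Nextflow process, with spike-in calibration, duplicate handling and QC in between:

# Steps 5-6: align to the target genome, filter on mapping quality, sort and index
bowtie2 -x target_index -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz \
    | samtools view -b -q 20 - \
    | samtools sort -o sample.target.bam -
samtools index sample.target.bam

# Steps 8-9: fragment coverage as bedGraph, then bigWig
bedtools genomecov -ibam sample.target.bam -bg > sample.bedGraph
bedGraphToBigWig sample.bedGraph genome.sizes sample.bigWig

# Step 10: peak calling against the IgG control with SEACR
SEACR_1.3.sh sample.bedGraph igg_control.bedGraph non stringent sample_seacr_peaks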

Usage

Note: If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
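For example, the bundled test profile can be run with Docker before pointing the pipeline at real data (<OUTDIR> is a placeholder for a writable results directory):

nextflow run nf-core/cutandrun -profile test,docker --outdir <OUTDIR>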

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv :

group,replicate,fastq_1,fastq_2,control
h3k27me3,1,h3k27me3_rep1_r1.fastq.gz,h3k27me3_rep1_r2.fastq.gz,igg_ctrl
h3k27me3,2,h3k27me3_rep2_r1.fastq.gz,h3k27me3_rep2_r2.fastq.gz,igg_ctrl
igg_ctrl,1,igg_rep1_r1.fastq.gz,igg_rep1_r2.fastq.gz,
igg_ctrl,2,igg_rep2_r1.fastq.gz,igg_rep2_r2.fastq.gz,

Each row represents a pair of FastQ files (paired-end data).

Now, you can run a typical CUT&RUN/CUT&Tag/TIPseq analysis using:

nextflow run nf-core/cutandrun \
    -profile <docker/singularity/.../institute> \
    --input samplesheet.csv \
    --peakcaller 'seacr,macs2' \
    --genome GRCh38 \
    --outdir <OUTDIR>

Warning: Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters; see the docs.
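For example, the same parameters can be supplied through a -params-file instead of on the command line. The snippet below is a minimal sketch; the file name and parameter values are illustrative:

# Write the pipeline parameters to a YAML file
cat > params.yaml << 'EOF'
input: 'samplesheet.csv'
peakcaller: 'seacr,macs2'
genome: 'GRCh38'
outdir: './results'
EOF

# Run the pipeline, reading parameters from the file
nextflow run nf-core/cutandrun -profile docker -params-file params.yaml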


Pipeline output

To see the results of a test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/cutandrun was originally written by Chris Cheshire ( @chris-cheshire ) and Charlotte West ( @charlotte-west ) from the Luscombe Lab at The Francis Crick Institute, London, UK.

The pipeline structure and parts of the downstream analysis were adapted from the original CUT&Tag analysis protocol from the Henikoff Lab . The removal of duplicates arising from linear amplification (also known as T7 duplicates) in the TIPseq protocol was implemented as described in the original TIPseq paper .

We thank Harshil Patel ( @drpatelh ) and everyone in the Luscombe Lab ( @luslab ) for their extensive assistance in the development of this pipeline.


Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines .

For further information or help, don't hesitate to get in touch on the Slack #cutandrun channel (you can join with this invite).

Citations

If you use nf-core/cutandrun for your analysis, please cite it using the following doi: 10.5281/zenodo.5653535

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x .

Code Snippets

"""
#!/usr/bin/env python

import yaml
import platform
from textwrap import dedent

def _make_versions_html(versions):
    html = [
        dedent(
            '''\\
            <style>
            #nf-core-versions tbody:nth-child(even) {
                background-color: #f2f2f2;
            }
            </style>
            <table class="table" style="width:100%" id="nf-core-versions">
                <thead>
                    <tr>
                        <th> Process Name </th>
                        <th> Software </th>
                        <th> Version  </th>
                    </tr>
                </thead>
            '''
        )
    ]
    for process, tmp_versions in sorted(versions.items()):
        html.append("<tbody>")
        for i, (tool, version) in enumerate(sorted(tmp_versions.items())):
            html.append(
                dedent(
                    f'''\\
                    <tr>
                        <td><samp>{process if (i == 0) else ''}</samp></td>
                        <td><samp>{tool}</samp></td>
                        <td><samp>{version}</samp></td>
                    </tr>
                    '''
                )
            )
        html.append("</tbody>")
    html.append("</table>")
    return "\\n".join(html)

def _make_versions_unique_html(versions):
    unique_versions = []

    for process, tmp_versions in sorted(versions.items()):
        for i, (tool, version) in enumerate(sorted(tmp_versions.items())):
            tool_version = tool + "=" + version
            if tool_version not in unique_versions:
                unique_versions.append(tool_version)

    unique_versions.sort()

    html = [
        dedent(
            '''\\
            <style>
            #nf-core-versions-unique tbody:nth-child(even) {
                background-color: #f2f2f2;
            }
            </style>
            <table class="table" style="width:100%" id="nf-core-versions-unique">
                <thead>
                    <tr>
                        <th> Software </th>
                        <th> Version  </th>
                    </tr>
                </thead>
            '''
        )
    ]

    for tool_version in unique_versions:
        tool_version_split = tool_version.split('=')
        html.append("<tbody>")
        html.append(
            dedent(
                f'''\\
                <tr>
                    <td><samp>{tool_version_split[0]}</samp></td>
                    <td><samp>{tool_version_split[1]}</samp></td>
                </tr>
                '''
            )
        )
        html.append("</tbody>")
    html.append("</table>")
    return "\\n".join(html)

module_versions = {}
module_versions["${task.process}"] = {
    'python': platform.python_version(),
    'yaml': yaml.__version__
}

with open("$versions") as f:
    workflow_versions = yaml.load(f, Loader=yaml.BaseLoader) | module_versions

workflow_versions["Workflow"] = {
    "Nextflow": "$workflow.nextflow.version",
    "$workflow.manifest.name": "$workflow.manifest.version"
}

versions_mqc = {
    'parent_id': 'software_versions',
    'parent_name': 'Software Versions',
    'parent_description': 'Details software versions used in the pipeline run',
    'id': 'software-versions-by-process',
    'section_name': '${workflow.manifest.name} software versions by process',
    'section_href': 'https://github.com/${workflow.manifest.name}',
    'plot_type': 'html',
    'description': 'are collected at run time from the software output.',
    'data': _make_versions_html(workflow_versions)
}

versions_mqc_unique = {
    'parent_id': 'software_versions',
    'parent_name': 'Software Versions',
    'parent_description': 'Details software versions used in the pipeline run',
    'id': 'software-versions-unique',
    'section_name': '${workflow.manifest.name} Software Versions',
    'section_href': 'https://github.com/${workflow.manifest.name}',
    'plot_type': 'html',
    'description': 'are collected at run time from the software output.',
    'data': _make_versions_unique_html(workflow_versions)
}

with open("software_versions.yml", 'w') as f:
    yaml.dump(workflow_versions, f, default_flow_style=False)

with open("software_versions_mqc.yml", 'w') as f:
    yaml.dump(versions_mqc, f, default_flow_style=False)

with open("software_versions_unique_mqc.yml", 'w') as f:
    yaml.dump(versions_mqc_unique, f, default_flow_style=False)

with open('local_versions.yml', 'w') as f:
    yaml.dump(module_versions, f, default_flow_style=False)
"""
"""
bedtools \\
    sort \\
    -i $intervals \\
    $sizes \\
    $args \\
    > ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
bamCoverage \
--bam $input \
$args \
--scaleFactor ${scale} \
--numberOfProcessors ${task.cpus} \
--outFileName ${prefix}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(bamCoverage --version | sed -e "s/bamCoverage //g")
END_VERSIONS
"""
"""
samtools \\
    view \\
    --threads ${task.cpus-1} \\
    ${reference} \\
    ${blacklist} \\
    $args \\
    $input \\
    $args2 \\
    > ${prefix}.${file_type}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.bam
touch ${prefix}.cram

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
NextFlow From line 50 of view/main.nf
"""
[ ! -f  ${prefix}.fastq.gz ] && ln -s $reads ${prefix}.fastq.gz
trim_galore \\
    $args \\
    --cores $cores \\
    --gzip \\
    $c_r1 \\
    $tpc_r1 \\
    ${prefix}.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    trim_galore: \$(echo \$(trim_galore --version 2>&1) | sed 's/^.*version //; s/Last.*\$//')
    cutadapt: \$(cutadapt --version)
END_VERSIONS
"""
"""
[ ! -f  ${meta.id}_1.fastq.gz ] && ln -s ${reads[0]} ${meta.id}_1.fastq.gz
[ ! -f  ${meta.id}_2.fastq.gz ] && ln -s ${reads[1]} ${meta.id}_2.fastq.gz
trim_galore \\
    $args \\
    --cores $cores \\
    --paired \\
    --gzip \\
    $c_r1 \\
    $c_r2 \\
    $tpc_r1 \\
    $tpc_r2 \\
    ${meta.id}_1.fastq.gz \\
    ${meta.id}_2.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    trim_galore: \$(echo \$(trim_galore --version 2>&1) | sed 's/^.*version //; s/Last.*\$//')
    cutadapt: \$(cutadapt --version)
END_VERSIONS

mv ${meta.id}_1_val_1.fq.gz ${prefix_1}.fastq.gz
mv ${meta.id}_2_val_2.fq.gz ${prefix_2}.fastq.gz

[ ! -f  ${meta.id}_1_val_1_fastqc.html ] || mv ${meta.id}_1_val_1_fastqc.html ${meta.id}_1${suffix}_fastqc.html
[ ! -f  ${meta.id}_2_val_2_fastqc.html ] || mv ${meta.id}_2_val_2_fastqc.html ${meta.id}_2${suffix}_fastqc.html

[ ! -f  ${meta.id}_1_val_1_fastqc.zip ] || mv ${meta.id}_1_val_1_fastqc.zip ${meta.id}_1${suffix}_fastqc.zip
[ ! -f  ${meta.id}_2_val_2_fastqc.zip ] || mv ${meta.id}_2_val_2_fastqc.zip ${meta.id}_2${suffix}_fastqc.zip
"""
"""
gtf2bed \\
    $args \\
    $gtf \\
    > ${gtf.baseName}.bed
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    perl: \$(echo \$(perl --version 2>&1) | sed 's/.*v\\(.*\\)) built.*/\\1/')
END_VERSIONS
"""
"""
awk $args $command $input $command2 > ${prefix}.awk.${ext}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    awk: \$(awk -Wversion 2>/dev/null | head -n 1 | awk '{split(\$0,a,","); print a[1];}' | egrep -o "([0-9]{1,}\\.)+[0-9]{1,}")
END_VERSIONS
"""
NextFlow From line 27 of linux/awk.nf
"""
awk $args -f $script $input > ${prefix}.awk.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    awk: \$(awk -Wversion 2>/dev/null | head -n 1 | awk '{split(\$0,a,","); print a[1];}' | egrep -o "([0-9]{1,}\\.)+[0-9]{1,}")
END_VERSIONS
"""
"""
cut $args $input $command > ${prefix}.cut.${ext}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cut: \$(cut --version | head -n 1 | awk '{print \$4;}')
END_VERSIONS
"""
NextFlow From line 26 of linux/cut.nf
"""
sort -T '.' $args $input_files > ${prefix}.sort.${ext}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    sort: \$(sort --version | head -n 1 | awk '{print \$4;}')
END_VERSIONS
"""
NextFlow From line 28 of linux/sort.nf
"""
multiqc -f $args $custom_config .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    multiqc: \$( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS
"""
"""
cat ${bed} | wc -l | awk -v OFS='\t' '{ print "Peak Count", \$1 }' | cat $peak_counts_header - > ${prefix}_mqc.tsv

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
READS_IN_PEAKS=\$(bedtools intersect -a ${fragments_bed} -b ${peaks_bed} -bed -c -f $min_frip_overlap |  awk -F '\t' '{sum += \$NF} END {print sum * 2}')
grep -m 1 'mapped (' ${flagstat} | awk -v a="\$READS_IN_PEAKS" -v OFS='\t' '{print "Peak FRiP Score", a/\$1}' | cat $frip_score_header - > ${prefix}_mqc.tsv

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
find_unique_reads.py \\
    --bed_path $input \\
    --output_path "${prefix}_unique_alignments.txt" \\
    --metrics_path "${prefix}_metrics.txt" \\
    --header_path $mqc_header \\
    --mqc_path "${prefix}_mqc.tsv"
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | grep -E -o \"([0-9]{1,}\\.)+[0-9]{1,}\")
END_VERSIONS
"""
"""
calc_frag_hist.py \\
    --frag_path "*len.txt" \\
    --output frag_len_hist.txt

if [ -f "frag_len_hist.txt" ]; then
    cat $frag_len_header_multiqc frag_len_hist.txt > frag_len_mqc.yml
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | grep -E -o \"([0-9]{1,}\\.)+[0-9]{1,}\")
    numpy: \$(python -c 'import numpy; print(numpy.__version__)')
    pandas: \$(python -c 'import pandas; print(pandas.__version__)')
    seaborn: \$(python -c 'import seaborn; print(seaborn.__version__)')
END_VERSIONS
"""
"""
echo "$output" > exp_files.txt
find -L * -iname "*.gtf" -exec echo -e {}"\\t0,48,73" \\; > gtf.igv.txt
find -L * -iname "*.gff" -exec echo -e {}"\\t0,48,73" \\; > gff.igv.txt
cat *.txt > igv_files.txt
igv_files_to_session.py igv_session.xml igv_files.txt $genome $gtf_bed --path_prefix './'

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | grep -E -o \"([0-9]{1,}\\.)+[0-9]{1,}\")
END_VERSIONS
"""
"""
peak_reproducibility.py \\
    --sample_id $meta.id \\
    --intersect $bed \\
    --threads ${task.cpus} \\
    --outpath .

cat $peak_reprod_header_multiqc *peak_repro.tsv > ${prefix}_mqc.tsv

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | grep -E -o \"([0-9]{1,}\\.)+[0-9]{1,}\")
    dask: \$(python -c 'import dask; print(dask.__version__)')
    numpy: \$(python -c 'import numpy; print(numpy.__version__)')
    pandas: \$(python -c 'import pandas; print(pandas.__version__)')
END_VERSIONS
"""
"""
plot_consensus_peaks.py \\
    --peaks "*.peaks.bed" \\
    --outpath .

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | grep -E -o \"([0-9]{1,}\\.)+[0-9]{1,}\")
    numpy: \$(python -c 'import numpy; print(numpy.__version__)')
    pandas: \$(python -c 'import pandas; print(pandas.__version__)')
    upsetplot: \$(python -c 'import upsetplot; print(upsetplot.__version__)')
END_VERSIONS
"""
"""
check_samplesheet.py $samplesheet samplesheet.valid.csv $params.use_control

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    python: \$(python --version | grep -E -o \"([0-9]{1,}\\.)+[0-9]{1,}\")
END_VERSIONS
"""
"""
samtools view $args -@ $task.cpus $bam | $args2 > ${prefix}.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
bedtools \\
    bamtobed \\
    $args \\
    -i $bam \\
    > ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
touch ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
bedtools \\
    complement \\
    -i $bed \\
    -g $sizes \\
    $args \\
    > ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
bedtools \\
    genomecov \\
    -ibam $intervals \\
    $args \\
    > ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
bedtools \\
    genomecov \\
    -i $intervals \\
    -g $sizes \\
    $args \\
    > ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
touch  ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
bedtools \\
    intersect \\
    -a $intervals1 \\
    -b $intervals2 \\
    $args \\
    $sizes \\
    > ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
touch ${prefix}.${extension}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
bedtools \\
    merge \\
    -i $bed \\
    $args \\
    > ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
touch ${prefix}.bed

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
END_VERSIONS
"""
"""
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\\.rev.1.bt2\$//"`
[ -z "\$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\\.rev.1.bt2l\$//"`
[ -z "\$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1

bowtie2 \\
    -x \$INDEX \\
    $reads_args \\
    --threads $task.cpus \\
    $unaligned \\
    $args \\
    2> ${prefix}.bowtie2.log \\
    | samtools $samtools_command $args2 --threads $task.cpus -o ${prefix}.${extension} -

if [ -f ${prefix}.unmapped.fastq.1.gz ]; then
    mv ${prefix}.unmapped.fastq.1.gz ${prefix}.unmapped_1.fastq.gz
fi

if [ -f ${prefix}.unmapped.fastq.2.gz ]; then
    mv ${prefix}.unmapped.fastq.2.gz ${prefix}.unmapped_2.fastq.gz
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
"""
touch ${prefix}.${extension}
touch ${prefix}.bowtie2.log
touch ${prefix}.unmapped_1.fastq.gz
touch ${prefix}.unmapped_2.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
    pigz: \$( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
"""
NextFlow From line 80 of align/main.nf
"""
mkdir bowtie2
bowtie2-build $args --threads $task.cpus $fasta bowtie2/${fasta.baseName}
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
END_VERSIONS
"""
"""
mkdir bowtie2
touch bowtie2/${fasta.baseName}.{1..4}.bt2
touch bowtie2/${fasta.baseName}.rev.{1,2}.bt2

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    bowtie2: \$(echo \$(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*\$//')
END_VERSIONS
"""
"""
cat ${readList.join(' ')} > ${prefix}.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""
NextFlow From line 26 of fastq/main.nf
"""
cat ${read1.join(' ')} > ${prefix}_1.merged.fastq.gz
cat ${read2.join(' ')} > ${prefix}_2.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""
NextFlow From line 40 of fastq/main.nf
"""
touch ${prefix}.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""
NextFlow From line 57 of fastq/main.nf
"""
touch ${prefix}_1.merged.fastq.gz
touch ${prefix}_2.merged.fastq.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    cat: \$(echo \$(cat --version 2>&1) | sed 's/^.*coreutils) //; s/ .*\$//')
END_VERSIONS
"""
NextFlow From line 68 of fastq/main.nf
"""
samtools faidx $fasta
cut -f 1,2 ${fasta}.fai > ${fasta}.sizes

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    getchromsizes: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${fasta}.fai
touch ${fasta}.sizes

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    getchromsizes: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
computeMatrix \\
    $args \\
    --regionsFileName $bed \\
    --scoreFileName $bigwig \\
    --outFileName ${prefix}.computeMatrix.mat.gz \\
    --outFileNameMatrix ${prefix}.computeMatrix.vals.mat.tab \\
    --numberOfProcessors $task.cpus

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(computeMatrix --version | sed -e "s/computeMatrix //g")
END_VERSIONS
"""
"""
multiBamSummary bins \\
    $args \\
    $label \\
    --bamfiles ${bams.join(' ')} \\
    --numberOfProcessors $task.cpus \\
    --outFileName all_bam.bamSummary.npz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(multiBamSummary --version | sed -e "s/multiBamSummary //g")
END_VERSIONS
"""
"""
plotCorrelation \\
    $args \\
    --corData $matrix \\
    --corMethod $resolved_method \\
    --whatToPlot $resolved_plot_type \\
    --plotFile ${prefix}.plotCorrelation.pdf \\
    --outFileCorMatrix ${prefix}.plotCorrelation.mat.tab

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(plotCorrelation --version | sed -e "s/plotCorrelation //g")
END_VERSIONS
"""
"""
plotFingerprint \\
    $args \\
    $extend \\
    --bamfiles ${bams.join(' ')} \\
    --plotFile ${prefix}.plotFingerprint.pdf \\
    --outRawCounts ${prefix}.plotFingerprint.raw.txt \\
    --outQualityMetrics ${prefix}.plotFingerprint.qcmetrics.txt \\
    --numberOfProcessors $task.cpus

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(plotFingerprint --version | sed -e "s/plotFingerprint //g")
END_VERSIONS
"""
"""
plotHeatmap \\
    $args \\
    --matrixFile $matrix \\
    --outFileName ${prefix}.plotHeatmap.pdf \\
    --outFileNameMatrix ${prefix}.plotHeatmap.mat.tab

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(plotHeatmap --version | sed -e "s/plotHeatmap //g")
END_VERSIONS
"""
"""
plotPCA \\
    $args \\
    --corData $matrix \\
    --plotFile ${prefix}.plotPCA.pdf \\
    --outFileNameData ${prefix}.plotPCA.tab

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    deeptools: \$(plotPCA --version | sed -e "s/plotPCA //g")
END_VERSIONS
"""
"""
printf "%s %s\\n" $rename_to | while read old_name new_name; do
    [ -f "\${new_name}" ] || ln -s \$old_name \$new_name
done

fastqc \\
    $args \\
    --threads $task.cpus \\
    $renamed_files

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS
"""
"""
touch ${prefix}.html
touch ${prefix}.zip

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    fastqc: \$( fastqc --version | sed -e "s/FastQC v//g" )
END_VERSIONS
"""
"""
# Not calling gunzip itself because it creates files
# with the original group ownership rather than the
# default one for that user / the work directory
gzip \\
    -cd \\
    $args \\
    $archive \\
    > $gunzip

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//')
END_VERSIONS
"""
NextFlow From line 23 of gunzip/main.nf
"""
touch $gunzip
cat <<-END_VERSIONS > versions.yml
"${task.process}":
    gunzip: \$(echo \$(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*\$//')
END_VERSIONS
"""
NextFlow From line 41 of gunzip/main.nf
"""
macs2 \\
    callpeak \\
    ${args_list.join(' ')} \\
    --gsize $macs2_gsize \\
    --format $format \\
    --name $prefix \\
    --treatment $ipbam \\
    $control

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    macs2: \$(macs2 --version | sed -e "s/macs2 //g")
END_VERSIONS
"""
"""
picard \\
    -Xmx${avail_mem}M \\
    MarkDuplicates \\
    $args \\
    --INPUT $bam \\
    --OUTPUT ${prefix}.bam \\
    --REFERENCE_SEQUENCE $fasta \\
    --METRICS_FILE ${prefix}.MarkDuplicates.metrics.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    picard: \$(echo \$(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
END_VERSIONS
"""
"""
touch ${prefix}.bam
touch ${prefix}.bam.bai
touch ${prefix}.MarkDuplicates.metrics.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    picard: \$(echo \$(picard MarkDuplicates --version 2>&1) | grep -o 'Version:.*' | cut -f2- -d:)
END_VERSIONS
"""
"""
preseq \\
    lc_extrap \\
    $args \\
    $paired_end \\
    -output ${prefix}.lc_extrap.txt \\
    $bam
cp .command.err ${prefix}.command.log

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    preseq: \$(echo \$(preseq 2>&1) | sed 's/^.*Version: //; s/Usage:.*\$//')
END_VERSIONS
"""
"""
samtools \\
    faidx \\
    $fasta \\
    $args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
${fastacmd}
touch ${fasta}.fai

cat <<-END_VERSIONS > versions.yml

"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
NextFlow From line 40 of faidx/main.nf
"""
samtools \\
    flagstat \\
    --threads ${task.cpus} \\
    $bam \\
    > ${prefix}.flagstat

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.flagstat

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
samtools \\
    idxstats \\
    --threads ${task.cpus-1} \\
    $bam \\
    > ${prefix}.idxstats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.idxstats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
samtools \\
    index \\
    -@ ${task.cpus-1} \\
    $args \\
    $input

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${input}.bai
touch ${input}.crai
touch ${input}.csi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
NextFlow From line 38 of index/main.nf
"""
samtools sort \\
    $args \\
    -@ $task.cpus \\
    -o ${prefix}.bam \\
    -T $prefix \\
    $bam

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.bam

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
NextFlow From line 41 of sort/main.nf
"""
samtools \\
    stats \\
    --threads ${task.cpus} \\
    ${reference} \\
    ${input} \\
    > ${prefix}.stats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.stats

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
NextFlow From line 41 of stats/main.nf
"""
samtools \\
    view \\
    --threads ${task.cpus-1} \\
    ${reference} \\
    ${readnames} \\
    $args \\
    -o ${prefix}.${file_type} \\
    $input \\
    $args2

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.bam
touch ${prefix}.cram

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    samtools: \$(echo \$(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*\$//')
END_VERSIONS
"""
NextFlow From line 57 of view/main.nf
"""
SEACR_1.3.sh \\
    $bedgraph \\
    $function_switch \\
    $args \\
    $prefix

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    seacr: $VERSION
    bedtools: \$(bedtools --version | sed -e "s/bedtools v//g")
    r-base: \$(echo \$(R --version 2>&1) | sed 's/^.*R version //; s/ .*\$//')
END_VERSIONS
"""
"""
bgzip  --threads ${task.cpus} -c $args $input > ${prefix}.${input.getExtension()}.gz
tabix $args2 ${prefix}.${input.getExtension()}.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""
"""
touch ${prefix}.${input.getExtension()}.gz
touch ${prefix}.${input.getExtension()}.gz.tbi
touch ${prefix}.${input.getExtension()}.gz.csi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    tabix: \$(echo \$(tabix -h 2>&1) | sed 's/^.*Version: //; s/ .*\$//')
END_VERSIONS
"""
"""
bedClip \\
    $bedgraph \\
    $sizes \\
    ${prefix}.bedGraph

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ucsc: $VERSION
END_VERSIONS
"""
"""
bedGraphToBigWig \\
    $bedgraph \\
    $sizes \\
    ${prefix}.bigWig

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ucsc: $VERSION
END_VERSIONS
"""
"""
touch ${prefix}.bigWig

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    ucsc: $VERSION
END_VERSIONS
"""
"""
mkdir $prefix

## Ensures --strip-components only applied when top level of tar contents is a directory
## If just files or multiple directories, place all in prefix
if [[ \$(tar -taf ${archive} | grep -o -P "^.*?\\/" | uniq | wc -l) -eq 1 ]]; then
    tar \\
        -C $prefix --strip-components 1 \\
        -xavf \\
        $args \\
        $archive \\
        $args2
else
    tar \\
        -C $prefix \\
        -xavf \\
        $args \\
        $archive \\
        $args2
fi

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//')
END_VERSIONS
"""
NextFlow From line 25 of untar/main.nf
"""
mkdir $prefix
touch ${prefix}/file.txt

cat <<-END_VERSIONS > versions.yml
"${task.process}":
    untar: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//')
END_VERSIONS
"""
NextFlow From line 54 of untar/main.nf