Workflow Steps and Code Snippets

794 tagged steps and code snippets that match keyword BEDTools

Whole exome sequencing snakemake workflow based on GATK best practice

shell:
    """
    bedtools slop -i {input.bed} -g {params.ref}.fai -b {params.padding} > {output}
    """
shell:
    '''
    grep -vE "^@" {input.intervalslist} |
        awk -v OFS='\t' '{{ $2 = $2 - 1; print }}' |
        bedtools intersect -c -a {input.BED} -b - |
        cut -f6 > {output.target_overlap_counts}

    function restrict_to_overlaps() {{
        # print lines from whole-genome file from loci with non-zero overlap
        # with target intervals
        WGS_FILE=$1
        EXOME_FILE=$2
        paste {output.target_overlap_counts} $WGS_FILE |
            grep -Ev "^0" |
            cut -f 2- > $EXOME_FILE
        echo "Generated $EXOME_FILE"
    }}

    restrict_to_overlaps {input.UD} {output.UD}
    restrict_to_overlaps {input.BED} {output.BED}
    restrict_to_overlaps {input.MU} {output.MU}

    '''
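The two ideas in the rule above can be sketched in Python, assuming tab-separated input: an interval list uses 1-based, closed coordinates, so its start is decremented to obtain a 0-based BED start, and `restrict_to_overlaps` keeps only the rows whose overlap count is non-zero. The Python function names mirror the shell ones for readability; the sample rows are made up.

```python
def interval_to_bed(line):
    """Convert one interval-list row (1-based start) to a BED-style row (0-based start)."""
    fields = line.rstrip("\n").split("\t")
    fields[1] = str(int(fields[1]) - 1)
    return "\t".join(fields)


def restrict_to_overlaps(counts, rows):
    """Keep the rows whose matching overlap count is non-zero (paste + grep -v '^0' + cut)."""
    return [row for count, row in zip(counts, rows) if count != 0]


# Made-up example: an interval starting at position 1 maps to BED start 0.
print(interval_to_bed("chr1\t1\t100\t+\ttarget_1"))

# Made-up example: three whole-genome loci, of which the second has no overlap.
print(restrict_to_overlaps([2, 0, 1], ["locus_a", "locus_b", "locus_c"]))
```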

Snakemake based analysis pipeline to identify m6As from eCLIP data

shell:
    """
    preCount=$(wc -l < results/*_IP.MpileupParser_MotifFrequency.xls)
    postCount=$(wc -l < results/*.m6aList.xls)
    finalCount=$(echo "scale = 3; 100 - ((($postCount - 1) / ($preCount - 1)) * 100)" | bc)
    java -cp workflow/scripts/meclip/target/meCLIP.jar com.github.ajlabuc.meclip.ConfidenceCategorizer {input.xls} {params.name} {params.prefix}.m6aList ${{finalCount}} && \
    sort -k1,1 -k2,2n {params.prefix}.m6aList.bed > {output.bed} && \
    bedtools intersect -a {output.bed} -b resources/*.geneNames.bed -wb -s -sorted > {params.prefix}.m6aList.sorted.annotated.bed && \
    java -cp workflow/scripts/meclip/target/meCLIP.jar com.github.ajlabuc.meclip.BedAnnotator {params.prefix}.m6aList.sorted.annotated.bed {output.xls} && \
    rm {params.prefix}.m6aList.sorted.annotated.bed  && \
    rm results/*_MotifFrequency.xls
    """
shell:
    """
    bedtools intersect -a {input.bed} -b resources/*.annotated.sorted.bed -wo -s -sorted > results/metaPlotR/annot_m6a.sorted.bed && \
    perl workflow/scripts/metaPlotR/rel_and_abs_dist_calc.pl --bed results/metaPlotR/annot_m6a.sorted.bed --regions resources/*.region_sizes.txt > {output.txt} 2> {log}
    """

kGWASflow is a Snakemake workflow for performing k-mers-based GWAS. (v1.2.3)

shell:
    """
    bedtools bamtobed -i {input.bam} > {output} 2> {log}
    """
shell:
    """
    bedtools bamtobed -i {input.bam} > {output.bed} 2> {log}
    """
shell:
    """
    bedtools merge -i {input.bed} -s -c 6 -o distinct > {output.merged_bed} 2> {log}
    """

A metatranscriptomic pipeline optimized for the study of microeukaryotes. (v1.0)

shell:
    """
    unset PERL5LIB
    bedtools getfasta -fi {input.merged} -bed {params.merged}.fasta.transdecoder.bed -fo {output.cds}
    """
shell:
    """
    unset PERL5LIB
    bedtools getfasta -fi {input.merged} -bed {params.merged}.fasta.transdecoder.bed -fo {output.cds}
    """
shell:
    """
    unset PERL5LIB
    bedtools getfasta -fi {input.merged} -bed {params.merged}.fasta.transdecoder.bed -fo {output.cds}
    """
tool / biotools

BEDTools

BEDTools is an extensive suite of utilities for comparing genomic features in BED format.