Workflow Steps and Code Snippets
Whole exome sequencing snakemake workflow based on GATK best practice
    shell:
        """
        bedtools slop -i {input.bed} -g {params.ref}.fai -b {params.padding} > {output}
        """
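As a rough illustration of what the `bedtools slop -b` call above does to each interval, the following Python sketch pads both ends of every interval and clamps the result to the chromosome bounds. The function name `slop`, the `chrom_sizes` dictionary (standing in for the `.fai` index passed via `-g`), and the example coordinates are assumptions made for illustration; they are not part of the workflow.

```python
def slop(intervals, chrom_sizes, padding):
    """Pad each (chrom, start, end) interval on both sides.

    Starts are clamped at 0 and ends at the chromosome length,
    mirroring how `bedtools slop -b` treats chromosome boundaries.
    """
    padded = []
    for chrom, start, end in intervals:
        size = chrom_sizes[chrom]
        padded.append((chrom, max(0, start - padding), min(size, end + padding)))
    return padded


# Hypothetical example data: one 1000 bp chromosome and two targets.
chrom_sizes = {"chr1": 1000}
targets = [("chr1", 50, 200), ("chr1", 950, 990)]
print(slop(targets, chrom_sizes, 100))
# [('chr1', 0, 300), ('chr1', 850, 1000)]
```

Both example intervals hit a chromosome edge, which is the case the `-g` genome file exists to handle.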
    shell:
        '''
        grep -vE "^@" {input.intervalslist} |
            awk -v OFS='\t' '$2=$2-1' |
            bedtools intersect -c -a {input.BED} -b - |
            cut -f6 > {output.target_overlap_counts}

        function restrict_to_overlaps() {{
            # print lines from whole-genome file from loci with non-zero overlap
            # with target intervals
            WGS_FILE=$1
            EXOME_FILE=$2
            paste {output.target_overlap_counts} $WGS_FILE |
                grep -Ev "^0" |
                cut -f 2- > $EXOME_FILE
            echo "Generated $EXOME_FILE"
        }}

        restrict_to_overlaps {input.UD} {output.UD}
        restrict_to_overlaps {input.BED} {output.BED}
        restrict_to_overlaps {input.MU} {output.MU}
        '''
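The first stage of the pipeline above drops the `@` header lines of the interval list and subtracts one from the second column, which converts a 1-based start into a 0-based BED start. The helper then keeps only the rows whose overlap count is non-zero. A minimal Python sketch of both steps is given here; the names `interval_list_to_bed` and `restrict_to_overlaps` (reused from the shell function) and the example rows are illustrative assumptions. One point worth checking against real data: because `'$2=$2-1'` is used as an awk pattern, a row whose new start evaluates to 0 is treated as false and is not printed, whereas this sketch keeps such rows.

```python
def interval_list_to_bed(lines):
    """Skip '@' header lines and shift the 1-based start to a 0-based start."""
    rows = []
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        fields[1] = str(int(fields[1]) - 1)
        rows.append("\t".join(fields))
    return rows


def restrict_to_overlaps(counts, rows):
    """Keep rows whose matching overlap count is non-zero."""
    return [row for count, row in zip(counts, rows) if count != 0]


# Hypothetical interval-list content: two header lines, two targets.
example = [
    "@HD\tVN:1.6\n",
    "@SQ\tSN:chr1\tLN:1000\n",
    "chr1\t1\t100\t+\ttarget_1\n",
    "chr1\t201\t300\t+\ttarget_2\n",
]
bed_rows = interval_list_to_bed(example)
print(bed_rows)
print(restrict_to_overlaps([0, 3], bed_rows))
```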
Snakemake based analysis pipeline to identify m6As from eCLIP data
    shell:
        """
        preCount=`wc -l < results/*_IP.MpileupParser_MotifFrequency.xls`; postCount=`wc -l < results/*.m6aList.xls`; finalCount=`echo "scale = 3; 100 - ((($postCount - 1) / ($preCount - 1)) * 100)" | bc`; \
        java -cp workflow/scripts/meclip/target/meCLIP.jar com.github.ajlabuc.meclip.ConfidenceCategorizer {input.xls} {} {params.prefix}.m6aList ${{finalCount}} && \
        sort -k1,1 -k2,2n {params.prefix}.m6aList.bed > {output.bed} && \
        bedtools intersect -a {output.bed} -b resources/*.geneNames.bed -wb -s -sorted > {params.prefix}.m6aList.sorted.annotated.bed && \
        java -cp workflow/scripts/meclip/target/meCLIP.jar com.github.ajlabuc.meclip.BedAnnotator {params.prefix}.m6aList.sorted.annotated.bed {output.xls} && \
        rm {params.prefix}.m6aList.sorted.annotated.bed && \
        rm results/*_MotifFrequency.xls
        """
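The `finalCount` expression above computes the percentage of entries that did not survive filtering: `100 - ((postCount - 1) / (preCount - 1)) * 100`, evaluated by `bc` with `scale = 3`, so the division is truncated to three decimal places. The Python sketch below reproduces that arithmetic with `decimal`. The function name `percent_removed` and the example line counts are illustrative assumptions, as is the reading that the `- 1` discounts a single header line in each file.

```python
from decimal import Decimal, ROUND_DOWN


def percent_removed(pre_lines, post_lines):
    """Percentage of entries removed between two line counts.

    The ratio is truncated to three decimals to mimic `bc` with scale = 3.
    """
    ratio = Decimal(post_lines - 1) / Decimal(pre_lines - 1)
    ratio = ratio.quantize(Decimal("0.001"), rounding=ROUND_DOWN)
    return Decimal(100) - ratio * 100


# Hypothetical counts: 1001 lines before filtering, 251 after.
print(percent_removed(1001, 251))  # 75.000
```

Because of the truncation, a ratio such as 1/3 becomes 0.333 and the reported percentage is 66.700 rather than 66.667.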
    shell:
        """
        bedtools intersect -a {input.bed} -b resources/*.annotated.sorted.bed > results/metaPlotR/annot_m6a.sorted.bed -wo -s -sorted && \
        perl workflow/scripts/metaPlotR/ --bed results/metaPlotR/annot_m6a.sorted.bed --regions resources/*.region_sizes.txt > {output.txt} 2> {log}
        """
kGWASflow is a Snakemake workflow for performing k-mers-based GWAS. (v1.2.3)
    shell:
        """
        bedtools bamtobed -i {input.bam} > {output} 2> {log}
        """
    shell:
        """
        bedtools bamtobed -i {input.bam} > {output.bed} 2> {log}
        """
    shell:
        """
        bedtools merge -i {input.bed} -s -c 6 -o distinct > {output.merged_bed} 2> {log}
        """
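The `bedtools merge -s -c 6 -o distinct` call above collapses overlapping or book-ended intervals, but only when they lie on the same strand, and reports the strand (column 6 of the `bamtobed` output) for each merged interval. The Python sketch below shows the same idea on four-field tuples. The function name `merge_by_strand` and the example intervals are assumptions for illustration, and the output here is grouped by strand, which is a simplification and may not match the row order bedtools produces.

```python
def merge_by_strand(intervals):
    """Merge overlapping or book-ended (chrom, start, end, strand) intervals.

    Intervals are merged only when chromosome and strand both match.
    """
    merged = []
    ordered = sorted(intervals, key=lambda r: (r[0], r[3], r[1], r[2]))
    for chrom, start, end, strand in ordered:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and last[3] == strand and start <= last[2]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (chrom, last[1], max(last[2], end), strand)
        else:
            merged.append((chrom, start, end, strand))
    return merged


# Hypothetical reads: three on the plus strand, one on the minus strand.
reads = [
    ("chr1", 10, 50, "+"),
    ("chr1", 40, 80, "+"),
    ("chr1", 45, 60, "-"),
    ("chr1", 80, 100, "+"),
]
print(merge_by_strand(reads))
# [('chr1', 10, 100, '+'), ('chr1', 45, 60, '-')]
```

The minus-strand read overlaps the plus-strand reads in coordinates but stays separate, which is the effect of `-s`.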
A metatranscriptomic pipeline optimized for the study of microeukaryotes. (v1.0)
    shell:
        """
        unset PERL5LIB
        bedtools getfasta -fi {input.merged} -bed {params.merged}.fasta.transdecoder.bed -fo {output.cds}
        """
    shell:
        """
        unset PERL5LIB
        bedtools getfasta -fi {input.merged} -bed {params.merged}.fasta.transdecoder.bed -fo {output.cds}
        """
    shell:
        """
        unset PERL5LIB
        bedtools getfasta -fi {input.merged} -bed {params.merged}.fasta.transdecoder.bed -fo {output.cds}
        """
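The `bedtools getfasta` calls above extract, for each BED row, the subsequence between a 0-based start and an exclusive end, and by default label it `chrom:start-end`. A minimal Python sketch of that slicing is shown here; the function name `get_fasta`, the in-memory `sequences` dictionary, and the example contig are assumptions for illustration, and strand handling is left out because the calls above do not pass `-s`.

```python
def get_fasta(sequences, bed_rows):
    """Return (header, subsequence) pairs for 0-based, half-open BED rows."""
    records = []
    for chrom, start, end in bed_rows:
        header = f"{chrom}:{start}-{end}"
        records.append((header, sequences[chrom][start:end]))
    return records


# Hypothetical assembled contig and one predicted coding region.
sequences = {"contig1": "ATGGCCTAAGG"}
cds_rows = [("contig1", 0, 9)]
for header, seq in get_fasta(sequences, cds_rows):
    print(f">{header}")
    print(seq)
```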