Hi-C Scaffold Assembly Workflow with Snakemake


HiC scaffold

A Snakemake workflow that scaffolds an assembly using Hi-C data.

To run

snakemake --cores [cpu] --use-conda
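Before a full run, it can help to preview which jobs Snakemake would schedule. The following is a minimal sketch, assuming Snakemake is on the PATH and the command is run from the repository root; the core count of 4 is an arbitrary placeholder, not a value taken from the workflow.

```shell
# Assumed core count; replace with the number of CPUs available.
CORES=4

if command -v snakemake >/dev/null 2>&1; then
    # --dry-run lists the planned jobs without executing them;
    # --printshellcmds also prints the shell command of each job.
    snakemake --dry-run --printshellcmds --cores "$CORES" --use-conda
else
    echo "snakemake not found on PATH" >&2
fi
```

If the dry run lists the expected jobs, the same command without `--dry-run` starts the actual run.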

Code Snippets

shell:
    """
    cp {input} {output}
    """
shell:
    """
    bwa index {input}
    samtools faidx {input}
    """
shell:
    """
    # Map R1 and R2 separately as single-end reads; the mates are re-paired in a later step.
    bwa mem \
        -t {threads} \
        {input.assembly} \
        {input.R1} | \
        samtools view -@ {threads} -Sb - \
        > {output.R1_mapped}
    bwa mem \
        -t {threads} \
        {input.assembly} \
        {input.R2} | \
        samtools view -@ {threads} -Sb - \
        > {output.R2_mapped}
    """
shell:
    """
    # Keep only the 5-prime portion of chimeric Hi-C alignments, for R1 and then R2.
    samtools view -h -@ {threads} {input.R1_mapped} | \
        filter_five_end.pl | \
        samtools view -Sb -@ {threads} - \
        > {output.R1_5endFiltered}
    samtools view -h -@ {threads} {input.R2_mapped} | \
        filter_five_end.pl | \
        samtools view -Sb -@ {threads} - \
        > {output.R2_5endFiltered}
    """
shell:
    """
    # Re-pair the filtered R1/R2 alignments, apply the MAPQ cutoff, then coordinate-sort.
    two_read_bam_combiner.pl {input.R1_5endFiltered} {input.R2_5endFiltered} samtools {params.mapq_filter} | \
        samtools view -bS -@ {threads} -t {input.fai} - | \
        samtools sort -@ {threads} -o {output}
    """
shell:
    """
    picard MarkDuplicates \
        -Xmx{params.mem} -XX:-UseGCOverheadLimit \
        INPUT={input} \
        OUTPUT={output.bam} \
        METRICS_FILE={output.metric} \
        ASSUME_SORTED=TRUE \
        VALIDATION_STRINGENCY=LENIENT \
        REMOVE_DUPLICATES=TRUE
    """
shell:
    """
    samtools sort -@ {threads} -o {output} -n {input}
    """
shell:
    """
    yahs \
        --no-contig-ec \
        --no-mem-check \
        -o {params.prefix} \
        {input.assembly} {input.bam}
    """
shell:
    """
    (juicer pre {input.bin} {input.agp} {input.fai} | \
        sort -k2,2d -k6,6d -T ./ --parallel={threads} -S24G | \
        awk 'NF' \
        > alignments_sorted.txt.part) \
        && (mv alignments_sorted.txt.part {output})
    """
SnakeMake From line 34 of rules/yahs.smk
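The snippet above sorts the `juicer pre` output on the two sequence-name columns, drops empty lines, and renames the `.part` file to its final name only if the pipeline exits with status 0. A small self-contained sketch of the same pattern on made-up records follows, assuming the eight-column Juicer short format (strand, name, position, fragment for each mate); the scaffold names, positions, and temporary directory are illustrative only.

```shell
# Toy stand-in for `juicer pre` output; names and positions are invented.
workdir=$(mktemp -d)
printf '%s\n' \
    '0 scaffold_2 100 0 1 scaffold_2 500 1' \
    '0 scaffold_1 300 0 1 scaffold_2 50 1' \
    '' \
    '0 scaffold_1 10 0 1 scaffold_1 900 1' \
    > "$workdir/records.txt"

# Sort on columns 2 and 6 (the two sequence names), drop empty lines,
# and write to a .part file that is renamed only if the pipeline succeeds.
(LC_ALL=C sort -k2,2d -k6,6d "$workdir/records.txt" | \
    awk 'NF' \
    > "$workdir/alignments_sorted.txt.part") \
    && mv "$workdir/alignments_sorted.txt.part" "$workdir/alignments_sorted.txt"

cat "$workdir/alignments_sorted.txt"
```

Writing to a `.part` file first means an interrupted run leaves no file under the final name, so Snakemake does not mistake a truncated output for a finished one.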
shell:
    """
    samtools faidx {input}
    cut -f 1,2 {input}.fai > {output}
    """
shell:
    """
    (java -jar -Xmx24G {params.juicer_tools_jar} pre {input.aln} out.hic.part {input.chrom_size}) \
    && (mv out.hic.part {output})
    """
SnakeMake From line 66 of rules/yahs.smk

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/ZexuanZhao/Hi-C-scaffolding
Name: hi-c-scaffolding
Version: 1
Copyright: Public Domain
License: None