pyIPSA: Integrative Splicing Analysis Pipeline


Integrative Pipeline for Splicing Analysis

Installation & Run

Step 1: Obtain a copy of this workflow

Clone this repository to your local system, into the place where you want to perform the data analysis.
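The clone-and-run steps can be sketched as follows (a minimal sketch: the `snakemake` flags shown are generic Snakemake CLI usage, assumed rather than taken from the pyIPSA documentation):

```shell
# Clone the workflow into your analysis directory
git clone https://github.com/pervouchinelab/pyIPSA.git
cd pyIPSA

# Dry-run to preview the planned jobs, then execute with 8 cores
# (--use-conda assumes the rules ship conda environments; check the README)
snakemake -n
snakemake --cores 8 --use-conda
```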

Code Snippets

Download and decompress the reference genome:

```python
shell:
    """
    wget -O {output.genome}.gz {params.url}
    gunzip {output.genome}.gz
    """
```
Index the BAM file with pysam:

```python
shell:
    """python3 -c 'import pysam; pysam.index("{input.bam}")'"""
```
Count splice-junction reads in each BAM file:

```python
shell:
    "python3 -m workflow.scripts.count_junctions "
    "-i {input.bam} "
    "-k {input.known} "
    "-o {output.junctions} "
    "-l {output.library_stats} "
    "{params.primary} {params.unique} "
    "-t {threads}"
```
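`count_junctions` detects spliced reads via the N (skipped-region) operations in their CIGAR strings. A minimal standalone sketch of that extraction (illustrative only; the real logic lives in `workflow/scripts/count_junctions.py` and works through pysam):

```python
import re

def junctions_from_cigar(pos, cigar):
    """Yield (intron_start, intron_end) for each N (skipped region)
    in a CIGAR string; pos is the 1-based leftmost mapping position."""
    ref = pos
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "N":
            # the intron spans [ref, ref + length - 1] on the reference
            yield ref, ref + length - 1
            ref += length
        elif op in "MD=X":  # operations that consume the reference
            ref += length

# Example: 50M, a 100-nt intron, then 50M
print(list(junctions_from_cigar(1000, "50M100N50M")))  # -> [(1050, 1149)]
```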
Gather per-library statistics into one table:

```python
shell:
    "python3 -m workflow.scripts.gather_library_stats "
    "{OUTPUT_DIR}/J1 "
    "-o {output.tsv}"
```
Aggregate junction counts, applying offset and intron-length bounds:

```python
shell:
    "python3 -m workflow.scripts.aggregate_junctions "
    "-i {input.junctions} "
    "-s {input.library_stats} "
    "-o {output.aggregated_junctions} "
    "--min_offset {params.min_offset} "
    "--min_intron_length {params.min_intron_length} "
    "--max_intron_length {params.max_intron_length}"
```
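The offset and intron-length thresholds passed above can be sketched as a predicate (the function name and default values here are illustrative, not pyIPSA's actual schema):

```python
def keep_junction(start, end, offset, min_offset=4,
                  min_intron_length=50, max_intron_length=500_000):
    """Apply thresholds like those passed to aggregate_junctions.
    offset = distance of the splice site from the read end."""
    intron_length = end - start + 1
    return (offset >= min_offset
            and min_intron_length <= intron_length <= max_intron_length)

print(keep_junction(1000, 1099, offset=10))  # True: 100-nt intron, good offset
print(keep_junction(1000, 1019, offset=10))  # False: 20-nt intron is too short
```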
Annotate junctions against known splice junctions and the genome sequence:

```python
shell:
    "python3 -m workflow.scripts.annotate_junctions "
    "-i {input.aggregated_junctions} "
    "-k {input.known_sj} "
    "-f {input.genome} "
    "-o {output.annotated_junctions}"
```
Assign a strand to each junction:

```python
shell:
    "python3 -m workflow.scripts.choose_strand "
    "-i {input.annotated_junctions} "
    "-r {input.ranked_list} "
    "-o {output.stranded_junctions} "
    "-s {output.junction_stats}"
```
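`choose_strand` assigns strands using a ranked list of evidence sources. One classic signal is the intron-boundary dinucleotide pair: the canonical GT..AG donor/acceptor, read on the opposite strand, appears as CT..AC. A hypothetical sketch of that single rule (not the script's actual ranking logic):

```python
def strand_from_dinucleotides(donor, acceptor):
    """Infer strand from intron-boundary dinucleotides (illustrative):
    GT..AG -> '+', its reverse complement CT..AC -> '-'."""
    if (donor, acceptor) == ("GT", "AG"):
        return "+"
    if (donor, acceptor) == ("CT", "AC"):
        return "-"
    return "."  # undetermined by this signal alone

print(strand_from_dinucleotides("GT", "AG"))  # -> +
```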
Collect the per-replicate junction statistics into a single TSV (a `run:` block; `defaultdict`, `Path`, and `pandas` as `pd` are imported at the top of the Snakefile):

```python
run:
    d = defaultdict(list)
    for replicate in input.junction_stats:
        p = Path(replicate)
        name = Path(p.stem).stem  # strip two suffixes to get the replicate name
        with p.open("r") as f:
            d["replicate"].append(name)
            for line in f:
                if line.startswith("-"):
                    break
                left, right = line.strip().split(": ")
                d[left].append(right)
    df = pd.DataFrame(d)
    if not df.empty:
        df = df.sort_values(by="replicate")  # sort_values returns a new frame
    df.to_csv(output.tsv, index=False, sep="\t")
```
Filter junctions by entropy, total count, and optionally GT/AG dinucleotides:

```python
shell:
    "python3 -m workflow.scripts.filter "
    "-i {input.stranded_junctions} "
    "-e {params.entropy} "
    "-c {params.total_count} "
    "{params.gtag} "
    "-o {output.filtered_junctions}"
```
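The entropy filter typically scores how evenly a junction's supporting reads are spread over alignment offsets: reads stacked at a single offset (a PCR-duplication signature) give an entropy near zero. A sketch of that score (an assumed definition; the exact one is in `workflow/scripts/filter.py`):

```python
from math import log2

def offset_entropy(counts):
    """Shannon entropy (bits) of the offset distribution of reads
    supporting a junction; higher = support spread over many offsets."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

print(round(offset_entropy([5, 5, 5, 5]), 2))  # -> 2.0 (uniform support)
print(offset_entropy([20]))                    # -> 0.0 (single offset)
```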
Merge stranded junctions across samples:

```python
shell:
    "python3 -m workflow.scripts.merge_junctions "
    "{input.stranded_junctions} "
    "-o {output.merged_junctions}"
```
Count poly(A) reads:

```python
shell:
    "python3 -m workflow.scripts.count_polyA "
    "-i {input.bam} "
    "-o {output.polyA} "
    "{params.primary} {params.unique} "
    "-t {threads}"
```
Aggregate poly(A) counts with a minimum-overhang threshold:

```python
shell:
    "python3 -m workflow.scripts.aggregate_polyA "
    "-i {input.polyA} "
    "-s {input.library_stats} "
    "-o {output.aggregated_polyA} "
    "--min_overhang {params.min_overhang}"
```
Count reads at splice sites (pooled):

```python
shell:
    "python3 -m workflow.scripts.count_sites "
    "-i {input.bam} "
    "-j {input.junctions} "
    "-s {input.stats} "
    "-o {output.pooled_sites} "
    "{params.primary} {params.unique} "
    "-t {threads}"
```
Aggregate pooled site counts:

```python
shell:
    "python3 -m workflow.scripts.aggregate_sites "
    "-i {input.sites} "
    "-s {input.stats} "
    "-o {output.aggregated_pooled_sites} "
    "-m {params.min_offset}"
```
Filter pooled sites by entropy and total count:

```python
shell:
    "python3 -m workflow.scripts.filter "
    "-i {input.aggregated_pooled_sites} "
    "--sites "
    "-e {params.entropy} "
    "-c {params.total_count} "
    "-o {output.filtered_pooled_sites}"
```
Count reads at splice sites (per sample):

```python
shell:
    "python3 -m workflow.scripts.count_sites "
    "-i {input.bam} "
    "-j {input.junctions} "
    "-s {input.stats} "
    "-o {output.sites} "
    "{params.primary} {params.unique} "
    "-t {threads}"
```
Aggregate per-sample site counts:

```python
shell:
    "python3 -m workflow.scripts.aggregate_sites "
    "-i {input.sites} "
    "-s {input.stats} "
    "-o {output.aggregated_sites} "
    "-m {params.min_offset}"
```
Filter per-sample sites by entropy and total count:

```python
shell:
    "python3 -m workflow.scripts.filter "
    "-i {input.aggregated_sites} "
    "--sites "
    "-e {params.entropy} "
    "-c {params.total_count} "
    "-o {output.filtered_sites}"
```
Compute splicing rates from filtered junctions and per-sample sites:

```python
shell:
    "python3 -m workflow.scripts.compute_rates "
    "-j {input.filtered_junctions} "
    "-s {input.filtered_sites} "
    "-o {output.rates}"
```
Compute splicing rates from filtered junctions and pooled sites:

```python
shell:
    "python3 -m workflow.scripts.compute_rates "
    "-j {input.filtered_junctions} "
    "-s {input.filtered_pooled_sites} "
    "-o {output.rates}"
```
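`compute_rates` combines junction (spliced) and site (unspliced) read counts into splicing rates. As an illustration of the kind of quantity involved (the exact definitions live in `workflow/scripts/compute_rates.py`), a simple spliced-fraction looks like:

```python
def splicing_rate(junction_count, site_count):
    """Fraction of reads supporting the spliced form at a site.
    Illustrative only; see workflow/scripts/compute_rates.py for
    the definitions pyIPSA actually uses."""
    total = junction_count + site_count
    return junction_count / total if total else float("nan")

print(splicing_rate(90, 10))  # -> 0.9 (mostly spliced)
print(splicing_rate(0, 40))   # -> 0.0 (unspliced/retained)
```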


URL: https://github.com/pervouchinelab/pyIPSA
Copyright: Public Domain
License: None