Sparse Signaling Pathway Sampling: MCMC for signaling pathway inference


Sparse Signaling Pathway Sampling

Code related to the manuscript Inferring signaling pathways with probabilistic programming (Merrell & Gitter, 2020), Bioinformatics, 36:Supplement_2, i822–i830.

This repository contains the following:

  • SSPS: A method that infers relationships between variables using time series data.

    • Modeling assumption: the time series data is generated by a Dynamic Bayesian Network (DBN).

    • Inference strategy: MCMC sampling over possible DBN structures (see the toy sketch after this list).

    • Implementation: written in Julia, using the Gen probabilistic programming language

  • Analysis code:

    • simulation studies;

    • convergence analyses;

    • evaluation on experimental data;

    • a Snakefile for managing all of the analyses.
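
The actual SSPS model and MCMC proposals are written in Julia with Gen (see the SSPS directory in the repository). As a loose, self-contained illustration of the general idea -- MCMC over DBN parent sets scored against time series data -- here is a toy Python sketch. Everything in it (the ridge-regression score, the crude sparsity penalty, the single-edge toggle proposal, and all names) is an assumption chosen for illustration, not SSPS's model.

    import numpy as np

    rng = np.random.default_rng(0)

    def log_score(X, child, parents, lam=1.0):
        # Gaussian log-likelihood of X[1:, child] ridge-regressed on X[:-1, parents],
        # plus a crude penalty that discourages large parent sets.
        y = X[1:, child]
        if parents:
            P = X[:-1, sorted(parents)]
            beta = np.linalg.solve(P.T @ P + lam * np.eye(len(parents)), P.T @ y)
            resid = y - P @ beta
        else:
            resid = y
        n = len(y)
        sigma2 = resid @ resid / n + 1e-8
        return -0.5 * n * np.log(sigma2) - np.log(n) * len(parents)

    def mcmc_parent_sets(X, n_steps=5000):
        # Metropolis sampling over DBN structures: each step proposes toggling
        # a single edge (i -> j) and accepts/rejects by the change in score.
        V = X.shape[1]
        parents = [set() for _ in range(V)]
        scores = [log_score(X, j, parents[j]) for j in range(V)]
        edge_freq = np.zeros((V, V))
        for _ in range(n_steps):
            j = rng.integers(V)              # child whose parent set is perturbed
            i = rng.integers(V)              # candidate parent: toggle edge i -> j
            proposal = parents[j] ^ {i}
            new_score = log_score(X, j, proposal)
            if np.log(rng.random()) < new_score - scores[j]:
                parents[j], scores[j] = proposal, new_score
            for c in range(V):               # accumulate posterior edge frequencies
                for p in parents[c]:
                    edge_freq[p, c] += 1.0
        return edge_freq / n_steps

    if __name__ == "__main__":
        # Tiny simulated time series in which variable 0 drives variable 1.
        T, V = 300, 3
        X = np.zeros((T, V))
        for t in range(1, T):
            X[t, 0] = rng.normal()
            X[t, 1] = 0.8 * X[t - 1, 0] + 0.3 * rng.normal()
            X[t, 2] = rng.normal()
        print(np.round(mcmc_parent_sets(X), 2))   # entry (0, 1) should be large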

Installation and basic setup

(If you plan to reproduce all of the analyses, then make sure you're on a host with access to plenty of CPUs. Ideally, you would have access to a cluster of some sort.)

  1. Clone this repository:

    git clone git@github.com:gitter-lab/ssps.git

  2. Install Julia 1.6 (and all Julia dependencies)

    • Download the correct Julia binary here: https://julialang.org/downloads/.
      E.g., for Linux x86_64:
    $ wget https://julialang-s3.julialang.org/bin/linux/x64/1.6/julia-1.6.7-linux-x86_64.tar.gz 
    $ tar -xvzf julia-1.6.7-linux-x86_64.tar.gz
    
    • Find additional installation instructions here: https://julialang.org/downloads/platform/.

    • Use Pkg -- Julia's package manager -- to install the project's Julia dependencies:

    $ cd ssps/SSPS
    $ julia --project=.
                   _
       _       _ _(_)_     |  Documentation: https://docs.julialang.org
      (_)     | (_) (_)    |
       _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
      | | | | | | |/ _` |  |
      | | |_| | | | (_| |  |  Version 1.6.7 (2022-07-19)
     _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
    |__/                   |

    julia> using Pkg
    julia> Pkg.instantiate()
    julia> exit()

Reproducing the analyses

In order to reproduce the analyses, you will need some extra bits of software.

  • We use Snakemake -- a python package -- to manage the analysis workflow.

  • We use some other python packages to postprocess the results, produce plots, etc.

  • Some of the baseline methods are implemented in R or MATLAB.

Hence, the analyses entail some extra setup:

  1. Install python dependencies (using conda)

    • For the purposes of these instructions, we assume you have Anaconda3 or Miniconda3 installed, and have access to the conda environment manager.
      (We recommend using Miniconda; find full installation instructions on the Miniconda website.)

    • We recommend setting up a dedicated virtual environment for this project. The following will create a new environment named ssps and install the required python packages:

    $ conda create -n ssps -c conda-forge pandas matplotlib numpy bioconda::snakemake-minimal
    $ conda activate ssps
    (ssps) $
    
    • If you plan to reproduce the analyses on a cluster, then install cookiecutter and the complete version of snakemake:
    (ssps) $ conda install -c conda-forge cookiecutter bioconda::snakemake
    

    Then find the appropriate Snakemake profile in this list: https://github.com/Snakemake-Profiles/doc and install it using cookiecutter:

    (ssps) $ cookiecutter https://github.com/Snakemake-Profiles/htcondor.git
    

    replacing the htcondor example above with your desired profile.

  2. Install R packages

  3. Check whether MATLAB is installed.

After completing this additional setup, we are ready to run the analyses.

  1. Make any necessary modifications to the configuration file, analysis_config.yaml. This file controls the space of hyperparameters and datasets explored in the analyses.

  2. Run the analyses using snakemake:

    • If you're running the analyses on your local host, simply move to the directory containing the Snakefile and call snakemake:
    (ssps) $ cd ssps
    (ssps) $ snakemake

    • Since Julia is just-in-time compiled, some time will be spent on compilation the first time you run SSPS. You may see some warnings in stdout -- this is normal.

    • If you're running the analyses on a cluster, call snakemake with the Snakemake profile you installed earlier:

    (ssps) $ cd ssps
    (ssps) $ snakemake --profile YOUR_PROFILE_NAME

    (You will probably need to edit the job submission parameters in the profile's config.yaml file.)

  3. Relax. It will take tens of thousands of CPU-hours to run all of the analyses.

Running SSPS on your data

Follow these steps to run SSPS on your dataset. You will need the following inputs (a minimal sketch of their formats follows this list):

  • a CSV file (tab-separated) containing your time series data;

  • a CSV file (comma-separated) containing your prior edge confidences;

  • optional: a JSON file containing a list of variable names (i.e., node names).
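
For concreteness, here is a minimal Python sketch of what such inputs could look like. The layout is inferred from the DREAM preprocessing scripts shown below (a tab-separated time series table with timeseries/timestep columns, a headerless V x V comma-separated prior matrix, and a JSON list of node names); the file names are hypothetical, and ssps_config.yaml determines what the run_ssps workflow actually expects.

    import json
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    nodes = ["A", "B", "C"]                      # hypothetical node names

    # Tab-separated time series: one row per (time series, timestep) pair,
    # one column per variable.
    ts = pd.DataFrame({
        "timeseries": ["ts1"] * 4 + ["ts2"] * 4,
        "timestep":   [0.0, 10.0, 30.0, 60.0] * 2,
    })
    for n in nodes:
        ts[n] = rng.normal(size=len(ts))
    ts.to_csv("timeseries.tsv", sep="\t", index=False)

    # Comma-separated prior edge confidences: a headerless V x V matrix whose
    # (i, j) entry is the confidence of an edge from node i to node j.
    prior = rng.uniform(size=(len(nodes), len(nodes)))
    np.savetxt("prior.csv", prior, delimiter=",")

    # Optional JSON list of node names, in the same order as the matrix rows.
    with open("nodes.json", "w") as f:
        json.dump(nodes, f)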

  1. Install the python dependencies if you haven't already. Find detailed instructions above.

  2. cd to the run_ssps directory

  3. Configure the parameters in ssps_config.yaml as appropriate

  4. Run Snakemake: $ snakemake --cores 1. Increase the value of --cores to raise the maximum number of CPU cores used.

A note about parallelism

SSPS allows two levels of parallelism: (1) at the Markov chain level and (2) at the iteration level.

  • Chain-level parallelism is provided via Snakemake. For example, Snakemake can run 4 chains simultaneously if you specify --cores 4 at the command line: $ snakemake --cores 4. In essence, this just creates 4 instances of SSPS that run simultaneously.

  • Iteration-level parallelism is provided by Julia's multi-threading features. The number of threads available to an SSPS instance is specified by an environment variable: JULIA_NUM_THREADS.

  • The total number of CPUs used by your SSPS jobs is the product of Snakemake's --cores parameter and Julia's JULIA_NUM_THREADS environment variable. Concretely: if we run snakemake --cores 2 with JULIA_NUM_THREADS=4, then up to 8 CPUs may be used at one time by the SSPS jobs.
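
As a small illustration of how the two knobs combine, the following sketch (a hypothetical helper, not part of the repository) launches Snakemake with both levels of parallelism set explicitly; it assumes snakemake is on your PATH.

    import os
    import subprocess

    snakemake_cores = 2      # chains run simultaneously (Snakemake's --cores)
    julia_threads = 4        # threads available to each SSPS instance

    # Up to snakemake_cores * julia_threads CPUs may be busy at once.
    print("peak CPUs:", snakemake_cores * julia_threads)   # 2 * 4 = 8

    # Equivalent to: JULIA_NUM_THREADS=4 snakemake --cores 2
    env = dict(os.environ, JULIA_NUM_THREADS=str(julia_threads))
    subprocess.run(["snakemake", "--cores", str(snakemake_cores)], env=env, check=True)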

Licenses

SSPS is available under the MIT License, Copyright © 2020 David Merrell.

The MATLAB code dynamic_network_inference.m has been modified from the original version, Copyright © 2012 Steven Hill and Sach Mukherjee.

The DREAM challenge data is described in Hill et al., 2016 and is originally from Synapse.

Code Snippets

import pandas as pd
import numpy as np
import argparse
import json

def build_weighted_adj(eda_filename):
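    """
    Read a prior network .eda file and return a weighted adjacency matrix
    (filled with the EdgeScore values) along with the ordered list of
    antibody names corresponding to the matrix rows/columns.
    """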

    df = pd.read_csv(eda_filename, sep=" ")
    df.reset_index(inplace=True)
    antibodies = df["level_0"].unique()
    print("ANTIBODIES: ", antibodies)
    antibody_map = { a:i for (i,a) in enumerate(antibodies) }

    V = len(antibody_map)
    adj = np.zeros((V,V))

    for (_, row) in df.iterrows():
        a = row["level_0"]
        b = row["level_2"]
        adj[antibody_map[a],antibody_map[b]] = row["EdgeScore"]

    print(adj)

    antibody_ls = [0 for i in antibody_map]
    for (name, idx) in antibody_map.items():
        antibody_ls[idx] = name

    return adj, antibody_ls



if __name__=="__main__":

    parser = argparse.ArgumentParser(description="")
    parser.add_argument("eda_file", help="path to a DREAM challenge time series CSV file")
    parser.add_argument("output_file", help="path where the output CSV will be written")
    parser.add_argument("antibody_file", help="path to output JSON file containing the indices of antibodies")
    args = parser.parse_args()

    adj_mat, antibody_ls = build_weighted_adj(args.eda_file)

    df = pd.DataFrame(adj_mat)
    df.to_csv(args.output_file, sep=",", index=False, header=False) 

    json.dump(antibody_ls, open(args.antibody_file, "w"))
import pandas as pd
import os
import argparse
import numpy as np

EXCLUDE = {"foxo3a_ps318_s321", "taz_ps89"}


def to_minutes(timestr):
    """
    Convert a time string (e.g., '10min') to a floating point
    number of minutes (e.g., 60.0)
    """

    if timestr[-3:] == "min":
        num = float(timestr[:-3])
    elif timestr[-2:] == "hr":
        num = float(timestr[:-2]) * 60.0

    return num


def get_antibody_row(df):
    for i, row in df.iterrows():
        if "Antibody Name" in row.values:
            return i
    return -1


def get_start_idxs(df):
    for i, row in df.iterrows():
        cols = np.where(row.values == "Timepoint")
        if len(cols[0]) > 0:
            return i, cols[0][0]+1
    return -1, -1


def load_dream_ts(csv_path, keep_start=False):
    """
    Read in a DREAM challenge time series CSV file
    and return a DataFrame with appropriate columns.
    """

    # the original CSV is strangely formatted -- it has
    # an extra column and a multi-line header.
    df = pd.read_csv(csv_path)
    df.drop("Unnamed: 0", axis=1, inplace=True)

    antibody_row = get_antibody_row(df)
    data_start_row, data_start_col = get_start_idxs(df)

    df.iloc[data_start_row, data_start_col:] = df.iloc[antibody_row, data_start_col:].values
    df.columns = df.loc[data_start_row,:].values
    df = df.loc[(data_start_row + 1):,:]
    df.index = range(df.shape[0])

    # The original format doesn't give a "Stimulus" label at timepoint 0.
    # NOTE: as written, both branches below drop the rows lacking a
    # Stimulus label, so keep_start currently has no effect here.
    if keep_start:
        print("TOO BAD.") 
        df = df[df["Stimulus"].isnull() == False]
    else:
        df = df[df["Stimulus"].isnull() == False]

    return df


def create_standard_dataframe(dream_df, ignore_stim=False,
                                        ignore_inhib=False):
    """
    For each context contained in `dream_df`, create a time series
    dataframe.
    """

    context_cols = []
    if not ignore_inhib:
        context_cols.append("Inhibitor")
    if not ignore_stim:
        context_cols.append("Stimulus")

    joiner = lambda x: "_".join(x)

    dream_df["context"] = df[context_cols].apply(joiner, axis=1) 
    contexts = dream_df["context"].unique()

    dream_df.rename(columns={"Timepoint": "timestep"}, inplace=True)
    dream_df.loc[:,"timeseries"] = dream_df[["Inhibitor","Stimulus"]].apply(joiner, axis=1)

    dream_df.sort_values(["context","timeseries","timestep"], inplace=True)

    keep_cols = ["context", "timeseries", "timestep"]
    idx_cols = keep_cols+["Inhibitor", "Stimulus"]

    # IMPORTANT: standard order of variables = lexicographic
    var_cols = [c for c in dream_df.columns if c not in idx_cols]
    var_cols = [c for c in var_cols if c.lower() not in EXCLUDE] 

    dream_df = dream_df[keep_cols + var_cols]

    dream_df = dream_df.astype({v:"float64" for v in var_cols})
    dream_df[var_cols] = dream_df[var_cols].applymap(np.log)

    # Deduplicate by taking means... not sure if this is the right way to go
    gp = dream_df.groupby(keep_cols)
    dream_df = gp.mean()
    dream_df.reset_index(inplace=True)

    return dream_df 


if __name__=="__main__":

    # Get command line args
    parser = argparse.ArgumentParser(description="")
    parser.add_argument("timeseries_file", help="path to a DREAM challenge time series CSV file")
    parser.add_argument("output_dir", help="directory where the output CSVs will be written")
    parser.add_argument("--ignore-stim", help="Do NOT treat different stimuli as different contexts.",
                        action="store_true")
    parser.add_argument("--ignore-inhibitor", help="Do NOT treat different inhibitors as different contexts.", 
                        action="store_true")
    parser.add_argument("--keep-start", help="Keep the time series data at timepoint 0",
                        action="store_true")
    args = parser.parse_args()

    ts_filename = str(args.timeseries_file)
    ignore_stim = args.ignore_stim
    ignore_inhib = args.ignore_inhibitor

    # Load the DREAM challenge data
    df = load_dream_ts(ts_filename, keep_start=args.keep_start)

    # transform these columns into more useful forms
    df["Timepoint"] = df["Timepoint"].map(to_minutes)
    df.loc[df["Inhibitor"].isnull(), "Inhibitor"] = "nothing"
    df.loc[df["Stimulus"].isnull(), "Stimulus"] = "nothing"

    # Convert the data to (context-specific) time series dataframes,
    # formatted correctly for our analysis
    new_ts_df = create_standard_dataframe(df, ignore_stim=ignore_stim, 
                                              ignore_inhib=ignore_inhib)

    in_fname = os.path.basename(ts_filename)
    cell_line = in_fname.split("_")[0] 

    contexts = new_ts_df["context"].unique()

    for ctxt in contexts:
        ctxt_str = "cl={}_stim={}".format(cell_line, ctxt)
        out_df = new_ts_df[new_ts_df["context"] == ctxt]
        out_df.iloc[:,1:].to_csv(os.path.join(str(args.output_dir), ctxt_str+".csv"),
                                 sep="\t", index=False)
import pandas as pd
import numpy as np
import matplotlib as mpl

mpl.use('Agg')
mpl.rcParams['text.usetex'] = True

from matplotlib import pyplot as plt
import sys
import os
import argparse
import script_util as su



def compute_t_statistics(df, test_key_cols, sample_col, qty_cols,
                             method_col, baseline_name):

    methods = df[method_col].unique().tolist()
    baseline = df[df[method_col] == baseline_name].copy()

    key_arrs = [df[k].unique() for k in test_key_cols+[method_col]]

    df.set_index(test_key_cols+[method_col], inplace=True)
    baseline.set_index(test_key_cols, inplace=True)

    result_df = pd.DataFrame(index=pd.MultiIndex.from_product(key_arrs),
                             columns=qty_cols)

    for ks in result_df.index:

        diffs = df.loc[ks, qty_cols] - baseline.loc[ks[:-1], qty_cols].values
        diffs.reset_index(inplace=True)

        gp = diffs.groupby(by=test_key_cols)
        means = gp[qty_cols].mean()
        stds = gp[qty_cols].std()
        n = means.shape[0]

        f = lambda x: x / np.sqrt(n)
        ses = stds.apply(f)
        ts = means / ses

        result_df.loc[ks, qty_cols] = ts.values

    result_df.index.rename(test_key_cols+[method_col], inplace=True)
    result_df.reset_index(inplace=True)
    return result_df


def aggregate_scores(table, key_cols, score_cols):

    gp = table.groupby(key_cols)
    agg = gp[score_cols].mean()
    agg.reset_index(inplace=True)

    return agg 


def make_heatmap(ax, relevant, x_col, y_col, qty_col, **kwargs):

    x_vals = relevant[x_col].unique()
    y_vals = relevant[y_col].unique()

    grid = np.zeros((len(y_vals), len(x_vals)))
    for i, x in enumerate(x_vals):
        for j, y in enumerate(y_vals):
            grid[j,i] = relevant.loc[(relevant[x_col] == x) & (relevant[y_col] == y), qty_col]

    img = ax.imshow(grid, origin="lower", **kwargs)
    ax.set_xticks(list(range(len(x_vals))))
    ax.set_xticklabels(x_vals)
    ax.set_yticks(list(range(len(y_vals))))
    ax.set_yticklabels(y_vals)
    #ax.set_xlim([-0.5, len(x_vals)-0.5])
    #ax.set_ylim([-0.5, len(y_vals)-0.5])

    #ax.label_outer()
    return img


def subplot_heatmaps(qty_df, macro_x_col, macro_y_col, 
                     micro_x_col, micro_y_col, qty_col, score_str,
                     output_filename="simulation_scores.png",
                     macro_x_vals=None, macro_y_vals=None,
                     cmap="Greys", vmin=None, vmax=None):

    if macro_x_vals is None:
        macro_x_vals = qty_df[macro_x_col].unique().tolist()

    if macro_y_vals is None:
        macro_y_vals = qty_df[macro_y_col].unique().tolist()

    n_rows = len(macro_y_vals)
    n_cols = len(macro_x_vals)

    fig, axarr = plt.subplots(n_rows, n_cols, 
                              sharey=True, sharex=True, 
                              figsize=(2.0*n_cols,2.0*n_rows))


    in_macro_y_vals = lambda x: x in macro_y_vals
    relevant_scores = qty_df.loc[qty_df[macro_y_col].map(in_macro_y_vals) , qty_col]

    if vmin is None:
        vmin = relevant_scores.quantile(0.05)
    if vmax is None:
        vmax = relevant_scores.quantile(0.95)

    nrm = mpl.colors.Normalize(vmin=vmin,vmax=vmax)
    mappable = mpl.cm.ScalarMappable(norm=nrm, cmap=cmap)

    imgs = []

    # Iterate through the different subplots
    for i, myv in enumerate(macro_y_vals):
        for j, psize in enumerate(macro_x_vals):

            ax = axarr[i][j]
            relevant = qty_df.loc[(qty_df[macro_y_col] == myv) & (qty_df[macro_x_col] == psize),:]

            img = make_heatmap(ax, relevant, micro_x_col, micro_y_col, qty_col, norm=nrm, cmap=cmap)
            imgs.append(img)

            #ax.set_xlim([0,3])
            #ax.set_ylim([0,3])
            if i == len(macro_y_vals)-1:
                ax.set_xlabel("${}$\n$V$ = {:d}".format(micro_x_col, int(psize)),family='serif')
            if j == 0:
                ax.set_ylabel("{}\n${}$".format(su.NICE_NAMES[myv], micro_y_col),family='serif')


    fig.suptitle("Simulation Study: {}".format(su.NICE_NAMES[score_str]),family='serif',fontsize=16)
    plt.tight_layout(rect=[0.0,0.0,1,0.95])
    fig.colorbar(imgs[-1], ax=axarr, location="top", shrink=0.8, pad=0.05, fraction=0.05, use_gridspec=True)

    plt.savefig(output_filename, dpi=300)#, bbox_inches="tight")


if __name__=="__main__":

    args = sys.argv
    infile = args[1]
    mean_outfile = args[2]
    t_outfile = args[3]
    score_str = args[4]
    baseline_name = args[5]
    methods = args[6:]

    table = pd.read_csv(infile, sep="\t") 

    key_cols = ["v","r","a"]
    sample_col = "replicate"
    score_cols = [score_str]
    method_col = "method"

    aggregate_table = aggregate_scores(table, key_cols + [method_col], score_cols)
    #aggregate_table.to_csv("means.tsv", sep="\t")

    t_stat_table = compute_t_statistics(table, key_cols, sample_col, score_cols,
                                        method_col, baseline_name)
    #t_stat_table.to_csv("t_statistics.tsv", sep="\t")

    print(mean_outfile)
    subplot_heatmaps(aggregate_table, "v", "method", "r", "a", score_str, score_str,
                     output_filename=mean_outfile, macro_y_vals=methods+[baseline_name],
                     cmap="Greys")

    print(t_outfile)
    subplot_heatmaps(t_stat_table, "v", "method", "r", "a", score_str, "t_stat_{}".format(score_str), 
                     output_filename=t_outfile, macro_y_vals=methods,
                     cmap="RdBu", vmin=-5.0, vmax=5.0) 
import script_util as su
import pandas as pd
import sys
import numpy as np
import os

if __name__=="__main__":

    input_files = sys.argv[1:-1]
    output_file = sys.argv[-1]

    AUCPR_STR = "aucpr"
    AUCROC_STR = "aucroc"

    table = su.tabulate_results(input_files, [[AUCPR_STR],[AUCROC_STR]])

    methods = [f.split(os.path.sep)[-2] for f in input_files]
    table["method"] = methods

    table.to_csv(output_file, index=False, sep="\t")
shell:
    "python scripts/tabulate_scores.py {input.scores} {output}"
SnakeMake From line 148 of master/Snakefile
shell:
    "python scripts/tabulate_scores.py {input.mcmc} {input.baselines} {output}"
SnakeMake From line 159 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.simulator} {wildcards.v} {wildcards.t} {SIM_M} {wildcards.r} {wildcards.a} {POLY_DEG} {output.ref} {output.true} {output.ts}"
SnakeMake From line 173 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.scorer} --truth-file {input.tr_dg} --pred-file {input.pp_res} --output-file {output.out}"
SnakeMake From line 187 of master/Snakefile
shell:
    "python scripts/sim_heatmap.py {input} {output.mean} {output.t} {wildcards.score} prior_baseline {SIM_METHODS}" 
SnakeMake From line 200 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.pp} --chain-samples {input.raw} --output-file {output.out} --burnin {CONV_BURNIN}"
SnakeMake From line 227 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ts_file} {input.ref_dg} {output} {CONV_TIMEOUT}"\
    +" --n-steps {CONV_MAX_SAMPLES} --regression-deg {wildcards.d}"\
    +" --lambda-prop-std {wildcards.lstd}"
SnakeMake From line 242 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.pp} --chain-samples {input.raw}  --output-file {output.out}"
SnakeMake From line 259 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ts_file} {input.ref_dg} {output} {SIM_TIMEOUT}"\
    +" --regression-deg {wildcards.d} --n-steps {SIM_MAX_SAMPLES}"\
    +" --lambda-prop-std 3.0 --large-indeg 15.0"
SnakeMake From line 273 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.pp} --chain-samples {input.raw}  --output-file {output.out}"
SnakeMake From line 293 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ts_file} {input.ref_dg} {output} {SIM_TIMEOUT}"\
    +" --regression-deg 1 --n-steps {SIM_MAX_SAMPLES}"\
    +" --lambda-prop-std 3.0 --large-indeg 15.0 --proposal uniform"
SnakeMake From line 308 of master/Snakefile
shell:
    "Rscript {FUNCH_DIR}/funchisq_wrapper.R {input.ts_file} {output}"
SnakeMake From line 330 of master/Snakefile
shell:
    "matlab -nodesktop -nosplash -nojvm -singleCompThread -r \'cd(\"{HILL_DIR}\"); try, hill_dbn_wrapper(\"{input.ts_file}\", \"{input.ref_dg}\", \"{output}\", -1, \"auto\", {SIM_TIMEOUT}), catch e, quit(1), end, quit\'"
SnakeMake From line 353 of master/Snakefile
shell:
    "matlab -nodesktop -nosplash -nojvm -singleCompThread -r \'cd(\"{HILL_DIR}\"); try, hill_dbn_wrapper(\"{input.ts}\", \"{input.ref}\", \"{output}\", {wildcards.deg}, \"{wildcards.mode}\", {HILL_TIME_TIMEOUT}), catch e, quit(1), end, quit\'"
SnakeMake From line 367 of master/Snakefile
shell:
    "python {SCRIPT_DIR}/tabulate_timetest_results.py {input} {output}"
SnakeMake From line 376 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ts} {input.ref} {output}"
SnakeMake From line 396 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ref} {output}"
SnakeMake From line 414 of master/Snakefile
shell:
    "python {input.scorer} {input.preds} {input.tr_desc} {input.ab} {output.out}"
SnakeMake From line 444 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.pp} --chain-samples {input.raw} --output-file {output.out}"\
    +" --stop-points {DREAM_STOPPOINTS}"
SnakeMake From line 459 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ts_file} {input.ref_dg} {output} {DREAM_TIMEOUT}"\
    +" --n-steps {CONV_MAX_SAMPLES} --regression-deg {wildcards.d}"\
    +" --lambda-prop-std {wildcards.lstd} --large-indeg {MCMC_INDEG}"
SnakeMake From line 475 of master/Snakefile
shell:
    "python {input.scorer} {input.preds} {input.tr_desc} {input.ab} {output.out}"
SnakeMake From line 492 of master/Snakefile
shell:
    "Rscript {input.method} {input.ts_file} {output}"
SnakeMake From line 506 of master/Snakefile
shell:
    "matlab -nodesktop -nosplash -nojvm -singleCompThread -r \'cd(\"{HILL_DIR}\"); try, hill_dbn_wrapper(\"{input.ts_file}\", \"{input.ref_dg}\", \"{output}\", -1, \"auto\", {SIM_TIMEOUT}), catch e, quit(1), end, quit\'"
SnakeMake From line 520 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ts} {input.ref} {output}"
SnakeMake From line 535 of master/Snakefile
shell:
    "julia --project={JULIA_PROJ_DIR} {input.method} {input.ref} {output}"
SnakeMake From line 549 of master/Snakefile
shell:
    "python scripts/preprocess_dream_ts.py {input} {DREAM_PREP_TS_DIR} --ignore-inhibitor"
SnakeMake From line 558 of master/Snakefile
shell:
    "python scripts/preprocess_dream_prior.py {input} {output.edges} {output.ab}"
SnakeMake From line 568 of master/Snakefile