A computational method to generate causal explanations for proteomic profiles using prior mechanistic knowledge in the literature, as recorded in cellular pathway maps.

This is a tool for pathway analysis of proteomic and phosphoproteomic datasets. CausalPath aims to identify mechanistic pathway relations that can explain the correlations observed in experiments.

Additional information about CausalPath can be found at https://github.com/PathwayAndDataAnalysis/causalpath

A work-in-progress manuscript describing this method is available here.

Usage

Step 1: Install workflow

If you simply want to use this workflow, download and extract the latest release. If you intend to modify and further develop this workflow, fork this repository. Please consider contributing any generally applicable modifications via a pull request.

In any case, if you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this repository and, once available, its DOI.

Step 2: Configure workflow

Configure the workflow according to your needs by editing the file omic_config.yaml.

Step 3: Execute workflow

To execute this workflow, you only need to install Snakemake via the Conda package manager. Snakemake automatically deploys the software needed by the workflow into isolated environments.

Test your configuration by performing a dry-run via

snakemake --use-conda -n

Execute the workflow locally via

snakemake --use-conda --cores $N

using $N cores. Alternatively, the workflow can be run in cluster or cloud environments (see the Snakemake documentation for details).

If you want to fix not only the software stack but also the underlying OS, use

snakemake --use-conda --use-singularity

in combination with any of the modes above.
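
The flags in this step can be combined with one another. As a rough illustration, the following Python sketch assembles the command lines shown above. The helper name build_snakemake_command is hypothetical and not part of this workflow, and it only uses flags that already appear in this README.

```python
import shlex


def build_snakemake_command(cores=None, dry_run=False, use_singularity=False):
    """Assemble a snakemake command line from the options described above.

    Hypothetical helper for illustration only; it is not part of this workflow.
    """
    cmd = ["snakemake", "--use-conda"]
    if use_singularity:
        # Also pin the underlying OS through a container.
        cmd.append("--use-singularity")
    if dry_run:
        # Only check the configuration; do not execute any rule.
        cmd.append("-n")
    if cores is not None:
        cmd += ["--cores", str(cores)]
    return cmd


if __name__ == "__main__":
    print(shlex.join(build_snakemake_command(dry_run=True)))
    print(shlex.join(build_snakemake_command(cores=4, use_singularity=True)))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the workflow directory if you prefer to launch the run from a Python script.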

Step 4: Investigate results

After successful execution, you can create a self-contained report with all results via:

snakemake --report report.html

Code Snippets

script:
    "../scripts/partition_data.py"        
script:
    "../scripts/partition_data_causal.py"
shell:
    "java -jar resources/causalpath/target/causalpath.jar results/{wildcards.transform}/{wildcards.type}/{wildcards.cond}"
shell:
    "java -jar resources/causalpath/target/causalpath.jar results/correlation/{wildcards.condition}"
from utils import ensure_dir, generate_data_files, generate_data_files_causal, generate_proteomics_data, generate_parameter_file
import pandas as pd
import os
from itertools import combinations

meta_file = snakemake.params.meta
meta = pd.read_csv(meta_file,sep='\t',index_col=0)
meta = meta.astype(str)

condition_id = snakemake.params.condition
permutations = snakemake.params.permutations
fdr = snakemake.params.fdr
site_match = snakemake.params.site_match
site_effect = snakemake.params.site_effect

phospho_prot_file = snakemake.params.phospho_prot
phospho_prot = pd.read_csv(phospho_prot_file,sep='\t')

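# The output path is expected to look like results/correlation/<cond>/<file>;
# drop the leading 'results' and the file name to recover the two folder names.
correlation, cond = snakemake.output[0].split('/')[1:-1]
causal_relnm = os.path.join(os.getcwd(), 'results', 'correlation', cond)
ensure_dir(causal_relnm)
# Keyword filter passed on below: metadata column -> list of condition labels.
kwargs = {condition_id: [str(cond)]}
print(kwargs)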

sub_data, baseline, contrast = generate_data_files_causal(phospho_prot, meta, condition_id, **kwargs)
generate_proteomics_data(sub_data, causal_relnm)
generate_parameter_file(relnm=causal_relnm, test_samps=contrast, control_samps=baseline, value_transformation='correlation', fdr_threshold=fdr, site_match=site_match, site_effect=site_effect, permutations=permutations,ctype='correlation')
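
The script above recovers its condition from the output path rather than from a wildcard. The following Python sketch shows what the slicing does on a made-up path. The file name parameters.txt and the condition label treated are assumptions for illustration, and parse_output_path is a hypothetical helper, not a function from utils.

```python
def parse_output_path(output_path):
    """Return the folder names between the top-level folder and the file name."""
    # "results/correlation/treated/parameters.txt".split("/") gives
    # ["results", "correlation", "treated", "parameters.txt"]; the slice [1:-1]
    # drops the first element ("results") and the last one (the file name).
    return output_path.split("/")[1:-1]


# Hypothetical output path with exactly two folder levels below "results".
correlation, cond = parse_output_path("results/correlation/treated/parameters.txt")
print(correlation, cond)
```

Because the result is unpacked into exactly two names, a path with a different folder depth raises a ValueError at this line, so the rule's output pattern and the script have to stay in sync.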
from utils import ensure_dir, generate_data_files, generate_proteomics_data, generate_parameter_file, generate_rna_data
import pandas as pd
import os
from itertools import combinations

meta_file = snakemake.params.meta
meta = pd.read_csv(meta_file,sep='\t',index_col=0)
meta = meta.astype(str)


phospho_prot_file = snakemake.params.phospho_prot
phospho_prot = pd.read_csv(phospho_prot_file,sep='\t').drop_duplicates()
phospho_prot.ID = phospho_prot.ID.str.upper()

condition_id = snakemake.params.condition
permutations = snakemake.params.permutations
fdr = snakemake.params.fdr
site_match = snakemake.params.site_match
site_effect = snakemake.params.site_effect
ds_thresh = snakemake.params.ds_thresh
rna_file = snakemake.params.rna_file

# The output path is expected to look like results/<transform>/<ctype>/<cond>/<file>.
transform, ctype, cond = snakemake.output[0].split('/')[1:-1]
relnm = os.path.join(os.getcwd(), 'results', transform, ctype, cond)
ensure_dir(relnm)
# A contrast name such as 'A_B' is split on '_' into its condition labels.
kwargs = {condition_id: cond.split('_')}
sub_data, baseline, contrast = generate_data_files(phospho_prot, meta, condition_id, **kwargs)

generate_proteomics_data(sub_data, relnm)

if rna_file is not None:
    print('Incorporating RNAseq into causal relations')

    rna_frame = pd.read_csv(rna_file,sep='\t',index_col=0)
    print('total RNAseq expression matrix of shape {},{}'.format(rna_frame.shape[0],rna_frame.shape[1]))
    print(rna_frame.head())
    sub_rna = rna_frame.reindex(sub_data.columns,axis=1).iloc[:,3:]
    print(sub_rna.head())
    generate_rna_data(sub_rna, relnm)

generate_parameter_file(ds_thresh=ds_thresh, relnm=relnm, test_samps=contrast, control_samps=baseline, ctype=ctype, value_transformation=transform, fdr_threshold=fdr, site_match=site_match, site_effect=site_effect, permutations=permutations)
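
In the script above, the contrast is encoded in the folder name and split on underscores. The following Python sketch reproduces only that step. The column name treatment and the labels are made up, and contrast_filter is a hypothetical helper. How generate_data_files assigns the labels to baseline and contrast is not shown in this README, so the sketch makes no claim about that.

```python
def contrast_filter(condition_id, cond):
    """Build the keyword filter used to select samples for one contrast.

    condition_id is the metadata column name; cond is the folder name that
    encodes the condition labels, separated by underscores.
    """
    return {condition_id: [str(label) for label in cond.split("_")]}


# Two labels separated by a single underscore give a two-element list.
print(contrast_filter("treatment", "drug_control"))

# A label that itself contains an underscore is split as well, so this
# yields three labels instead of two.
print(contrast_filter("treatment", "drug_A_control"))
```

One consequence of this encoding is that condition labels in the metadata should not contain underscores; otherwise the filter would list fragments that match no sample.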

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/PathwayAndDataAnalysis/causalpath/blob/master/README.md
Name: causal-path-pipeline
Version: 1
Copyright: Public Domain
License: None