OpenEBench TCGA Cancer Driver Genes benchmarking workflow

Description

The workflow takes an input file with Cancer Driver Genes predictions (i.e. the results provided by a participant), computes a set of metrics, and compares them against the data currently stored in OpenEBench within the TCGA community. Two assessment metrics are computed for those predictions. Optional plots that visualize the performance of the tool are also generated. The workflow consists of three standard steps defined by OpenEBench. The tools needed to run these steps are containerised in three Docker images, whose recipes are available in the TCGA_benchmarking_dockers repository; the images themselves are stored in the INB GitLab container registry. A separate instance is spawned from these images for each step:

  1. Validation: the input file format is checked and, if required, the content of the file is validated (e.g. checking whether the submitted gene IDs exist).
  2. Metrics Generation: the predictions are compared with the 'Gold Standards' provided by the community, which results in two performance metrics: precision (Positive Predictive Value) and recall (True Positive Rate).
  3. Consolidation: the benchmark itself is performed by merging the tool metrics with the rest of the TCGA data. The results are provided in JSON format and SVG format (scatter plot).
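
The two metrics named in step 2 can be illustrated with a minimal Python sketch. It is not the code in `compute_metrics.py`; the function name `precision_recall` and the gene lists are hypothetical and serve only to show how precision and recall follow from the overlap between a prediction set and a gold standard.

```python
def precision_recall(predicted, gold_standard):
    """Return (precision, recall) for predicted genes against a gold standard."""
    predicted = set(predicted)
    gold_standard = set(gold_standard)
    true_positives = len(predicted & gold_standard)
    # Precision (PPV): fraction of predicted genes that are in the gold standard.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall (TPR): fraction of gold-standard genes that were predicted.
    recall = true_positives / len(gold_standard) if gold_standard else 0.0
    return precision, recall


# Hypothetical gene lists, for illustration only.
predicted_genes = ["TP53", "KRAS", "BRCA1", "MYC"]
gold_genes = ["TP53", "KRAS", "PTEN", "EGFR", "PIK3CA"]

precision, recall = precision_recall(predicted_genes, gold_genes)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four predicted genes appear in the five-gene gold standard, so precision is 2/4 and recall is 2/5.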

Code Snippets

"""
python /app/validation.py -i $input_file -r $ref_dir -com $community_id -c $cancer_types -p $tool_name -o validation.json
"""
Nextflow (from line 104 of 1.0.8/main.nf)
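
The kind of content check the validation step describes (do the submitted gene IDs exist?) might look roughly like the sketch below. This is an assumption-laden illustration, not the contents of `/app/validation.py`: the function `validate_predictions`, the shape of its result, and the example IDs are all hypothetical.

```python
from collections import Counter


def validate_predictions(gene_ids, known_gene_ids):
    """Report duplicated and unknown gene IDs in a submission (hypothetical check)."""
    counts = Counter(gene_ids)
    duplicates = sorted(gene for gene, n in counts.items() if n > 1)
    unknown = sorted(set(gene_ids) - set(known_gene_ids))
    return {
        # A submission passes only if it is non-empty and has no problems.
        "validated": bool(gene_ids) and not duplicates and not unknown,
        "duplicates": duplicates,
        "unknown": unknown,
    }


# Hypothetical reference set and submission, for illustration only.
reference_ids = {"TP53", "KRAS", "PTEN"}
report = validate_predictions(["TP53", "KRAS", "KRAS", "XYZ1"], reference_ids)
print(report)
```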
"""
python /app/compute_metrics.py -i $input_file -c $cancer_types -m $gold_standards_dir -p $tool_name -com $community_id -o assessment.json
"""
Nextflow (from line 130 of 1.0.8/main.nf)
"""
cp -Lpr $benchmark_data augmented_benchmark_data
python /app/manage_assessment_data.py -b augmented_benchmark_data -p $assessment_out -o aggregation_dir
python /app/merge_data_model_files.py -p $validation_out -m $assessment_out -a aggregation_dir -o data_model_export.json
"""
Nextflow (from line 153 of 1.0.8/main.nf)
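
Conceptually, consolidation merges one participant's metrics into the existing benchmark data. The sketch below shows that idea only; it does not reproduce `manage_assessment_data.py` or `merge_data_model_files.py`, and the record layout (`tool`, `precision`, `recall`), the tool names, and the metric values are hypothetical.

```python
import json


def consolidate(benchmark_entries, new_entry):
    """Merge a participant's metrics into existing benchmark entries.

    Any earlier entry for the same tool is replaced by the new one.
    """
    merged = [e for e in benchmark_entries if e["tool"] != new_entry["tool"]]
    merged.append(new_entry)
    return sorted(merged, key=lambda e: e["tool"])


# Hypothetical benchmark data and participant result, for illustration only.
existing = [
    {"tool": "tool_a", "precision": 0.60, "recall": 0.30},
    {"tool": "tool_b", "precision": 0.45, "recall": 0.50},
]
participant = {"tool": "tool_b", "precision": 0.50, "recall": 0.40}

aggregated = consolidate(existing, participant)
print(json.dumps(aggregated, indent=2))
```

Each merged record carries one precision/recall pair per tool, which is the kind of data a precision-versus-recall scatter plot would draw as one point per tool.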

Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/inab/TCGA_benchmarking_workflow/tree/1.0.8
Name: openebench-tcga-cancer-driver-genes-benchmarking-w
Version: Version 4
Downloaded: 0
Copyright: Public Domain
License: GNU Affero General Public License v3.0