HPC Molecular Dynamics Simulation Workflow: CWL Version of md_list.py
This repository is part of a series of repositories mirroring the workflow and launchers in https://github.com/bioexcel/biobb_hpc_workflows.
Below the workflow is briefly described, followed by installation instructions, and guidance on running the workflow with the two workflow engines it has been tested on (CWLtool and TOIL).
md_list / md_launch
The md_list workflow performs a molecular dynamics simulation on a given structure listed in the YAML properties file.
The md_launch workflow will run the md_list workflow multiple times (using scatter), passing it structures from a list defined in the YAML properties file.
Getting Started
Requirements
If you are working on your own machine you will need Git and Docker; installation instructions are given on their websites. You will be able to use either CWLtool (which is the reference implementation of CWL) or Toil.
If you are working on HPC then you will need Singularity and Toil rather than Docker and CWLtool. Git and Singularity should already be installed, while the installation of Toil (if this is not installed already) will be covered below.
Version requirements:
- The workflow engine should support CWL standard 1.2 or more recent (versions tested: 1.2.0-dev5 in toil; 1.2 in CWLtool)
Setup
These workflows make use of the BioBB libraries, which are installed using git submodules. This requires that you clone this repository, rather than downloading a zip archive (a zip archive does not carry the git metadata that submodules need):
git clone --recurse-submodules https://github.com/douglowe/biobb_hpc_cwl_md_list.git
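If the repository has already been cloned without --recurse-submodules, the submodules can usually be fetched afterwards. The following is a minimal sketch, assuming it is run from the top level of the clone; the function name init_submodules is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): fetch any submodules
# that have not been initialised yet. Run from the top level of the clone.
init_submodules() {
    # In `git submodule status` output, a leading "-" marks a submodule
    # that has not been initialised.
    if git submodule status --recursive | grep -q '^-'; then
        git submodule update --init --recursive
    fi
}
```

Calling init_submodules in a clone whose submodules are already present does nothing.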
CWLtool
This can be installed via conda, with the command:
conda env create -f install/env_cwlrunner.yml
To install a JavaScript interpreter (if you do not already have one on your system) use:
conda env create -f install/env_cwlrunner_nodejs.cwl
TOIL
This can be installed using conda, with the command:
conda env create -f install/env_toil.yml
To install a JavaScript interpreter (if you do not already have one on your system) use:
conda env create -f install/env_toil_nodejs.cwl
Running the Workflows
This workflow requires:
- A PDB file describing the molecule of interest (see example example_input_files/lysozyme.pdb).
- A configuration file (see example md_list_input_descriptions.yml).
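Before launching a run it can help to confirm that both inputs are present. The sketch below assumes it is run from the repository root; check_inputs is a hypothetical helper introduced here for illustration, not something the repository provides.

```shell
# Hypothetical helper: report every missing input file on stderr and
# return non-zero if any of the given paths is not a regular file.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call, using the file names given in the list above:
#   check_inputs example_input_files/lysozyme.pdb md_list_input_descriptions.yml
```

The check only tests that the files exist; it does not validate the PDB or YAML contents.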
CWL
To run the workflow use:
cwl-runner md_launch.cwl md_list_input_descriptions.yml
TOIL
Toil (at the time of writing, version 5.2.0) does not yet fully support the CWL v1.2.0 standard, so you will need to edit md_list.cwl to use:
cwlVersion: v1.2.0-dev5
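The edit to md_list.cwl can be made by hand or scripted. The following is a minimal sketch, assuming the file contains a single top-level line starting with cwlVersion:; the function name set_cwl_version is hypothetical.

```shell
# Hypothetical helper: rewrite the top-level cwlVersion line of a CWL file.
# Usage: set_cwl_version <file> <version>
set_cwl_version() {
    file=$1
    version=$2
    # Only a line starting with "cwlVersion:" is changed; all other lines
    # are copied through unchanged. A temporary file is used instead of
    # `sed -i`, whose syntax differs between GNU and BSD sed.
    sed "s/^cwlVersion:.*/cwlVersion: ${version}/" "$file" > "${file}.tmp" &&
        mv "${file}.tmp" "$file"
}

# Example call:
#   set_cwl_version md_list.cwl v1.2.0-dev5
```

The same call with the original version string restores the file for use with CWLtool.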
To use the Toil engine, several environment variables need to be set. These are described in more detail in the Toil documentation; below we highlight only the variables we found useful to set.
On all HPC systems it is wise to check the temporary directory variable (TMPDIR). For Toil this needs to be on a disk accessible by all compute nodes that will be used.
For Singularity set the variables CWL_SINGULARITY_CACHE and SINGULARITY_CACHEDIR (again on a disk accessible by all compute nodes).
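The three variables above could be set together along the following lines. This is a sketch under stated assumptions: the function name setup_toil_env, the subdirectory names, and the choice to share one directory between both Singularity cache variables are illustrative, and the path passed in must be on a filesystem that every compute node can see.

```shell
# Hypothetical helper: point Toil and Singularity at directories under one
# shared path. Assumption: $1 is on a filesystem visible to all compute nodes.
setup_toil_env() {
    shared=$1
    # Create the directories first so that a bad path fails early.
    mkdir -p "$shared/tmp" "$shared/singularity_cache" || return 1
    export TMPDIR="$shared/tmp"
    # Assumption: both cache variables may use the same directory.
    export CWL_SINGULARITY_CACHE="$shared/singularity_cache"
    export SINGULARITY_CACHEDIR="$shared/singularity_cache"
}

# Example call (the path is a placeholder for your site's shared scratch):
#   setup_toil_env /path/to/shared/scratch
```

The function has to be called in the same shell session (or job script) that later starts toil-cwl-runner, since the exports do not persist across sessions.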
GridEngine (SGE)
For SGE set:
- TOIL_GRIDENGINE_PE (this is the job queue to select)
- TOIL_GRIDENGINE_ARGS
To execute the workflow use:
toil-cwl-runner --enable-dev --batchSystem grid_engine --singularity --defaultCores 1 md_launch.cwl md_list_input_descriptions.yml
This example sets the number of cores used to 1. We recommend you test your setup as a serial job before trying to use parallel compute nodes. When changing to a parallel compute job, change the --defaultCores flag.
SLURM
For Slurm job managers set:
- TOIL_SLURM_ARGS, which carries all the required Slurm job flags, e.g. "--nodes=1 --ntasks-per-node=64 --time=0:10:0 --partition=standard --qos=standard --account=[XXX] --export=ALL"
To execute the workflow use:
toil-cwl-runner --enable-dev --batchSystem slurm --singularity md_launch.cwl md_list_input_descriptions.yml
Copyright & Licensing
This software has been developed in the MMB group at the BSC & IRB; and in the eScience Lab and Research IT groups at the University of Manchester for the European BioExcel, funded by the European Commission (EU H2020 823830, EU H2020 675728).
- (c) 2015-2021 Barcelona Supercomputing Center
- (c) 2015-2021 Institute for Research in Biomedicine
- (c) 2021 University of Manchester
Licensed under the Apache License 2.0, see the file LICENSE for details.
Code Snippets
baseCommand: editconf

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_gro_path:
    type: File
    format: edam:format_GROMACS_GRO
    inputBinding:
      position: 1
      prefix: --input_gro_path
  output_gro_path:
    type: string
    inputBinding:
      position: 2
      prefix: --output_gro_path
    default: "structure_box.gro"
  config:
    type: string?
    inputBinding:
      position: 3
      prefix: --config
baseCommand: genion

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_tpr_path:
    type: File
    # TODO: Not yet in EDAM
    #format: edam:format_GROMACS_TPR
    inputBinding:
      position: 1
      prefix: --input_tpr_path
  output_gro_path:
    type: string
    inputBinding:
      position: 2
      prefix: --output_gro_path
    default: "structure_ions.gro"
  input_top_zip_path:
    type: File
    format: edam:format_2333
    inputBinding:
      position: 3
      prefix: --input_top_zip_path
  output_top_zip_path:
    type: string
    inputBinding:
      position: 4
      prefix: --output_top_zip_path
    default: "topology_ions_top.zip"
  config:
    type: string?
    inputBinding:
      position: 5
      prefix: --config
baseCommand: grompp

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_gro_path:
    label: Path to GRO file
    doc: |
      Path to the input GROMACS structure GRO file.
      Type: str
      File type: input
      Accepted formats: gro
      Example file: https://github.com/bioexcel/biobb_md/raw/master/biobb_md/test/data/gromacs/grompp.gro
    type: File
    format: edam:format_GROMACS_GRO
    inputBinding:
      position: 1
      prefix: --input_gro_path
  input_top_zip_path:
    label: Path to TOP and ITP files
    doc: |
      Path to the input GROMACS topology TOP and ITP files in zip format.
      Type: str
      File type: input
      Accepted formats: zip
      Example file: https://github.com/bioexcel/biobb_md/raw/master/biobb_md/test/data/gromacs/grompp.zip
    type: File
    format: edam:format_2333
    inputBinding:
      position: 2
      prefix: --input_top_zip_path
  output_tpr_path:
    label: Path to TPR file; Optional
    doc: |
      Path to the output portable binary run file TPR.
      Type: str
      File type: output
      Accepted formats: tpr
      Example file: https://github.com/bioexcel/biobb_md/raw/master/biobb_md/test/reference/gromacs/ref_grompp.tpr
    type: string
    inputBinding:
      position: 3
      prefix: --output_tpr_path
    default: "system.tpr"
  input_cpt_path:
    label: Path to the input GROMACS checkpoint file CPT.
    doc: |
      Path to the input GROMACS checkpoint file CPT. Optional parameter.
      Type: str
      File type: input
      Accepted formats: cpt
    type: File?
    format: edam:format_2333
    inputBinding:
      prefix: --input_cpt_path
  config:
    label: Advanced configuration options for GROMACS
    doc: |
      Advanced configuration options for GROMACS. This should be passed as a string containing a dict.
      The possible options to include here are listed under 'properties' in the gromacs documentation: https://biobb-md.readthedocs.io/en/latest/gromacs.html#module-gromacs.grompp
    type: string?
    inputBinding:
      prefix: --config
baseCommand: make_ndx

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_structure_path:
    type: File
    format: edam:format_GROMACS_GRO
    inputBinding:
      position: 1
      prefix: --input_structure_path
  output_ndx_path:
    type: string
    inputBinding:
      position: 2
      prefix: --output_ndx_path
    default: "custom_index.ndx"
  input_ndx_path:
    type: File?
    format: edam:format_2330
    inputBinding:
      prefix: --input_ndx_path
  config:
    type: string?
    inputBinding:
      prefix: --config
baseCommand: mdrun

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_tpr_path:
    type: File
    format: edam:format_2333
    inputBinding:
      position: 1
      prefix: --input_tpr_path
  output_trr_path:
    type: string
    inputBinding:
      position: 2
      prefix: --output_trr_path
    default: "trajectory.trr"
  output_gro_path:
    type: string
    inputBinding:
      position: 3
      prefix: --output_gro_path
    default: "trajectory.gro"
  output_edr_path:
    type: string
    inputBinding:
      position: 4
      prefix: --output_edr_path
    default: "trajectory.edr"
  output_log_path:
    type: string
    inputBinding:
      position: 5
      prefix: --output_log_path
    default: "trajectory.log"
  output_xtc_path:
    type: string?
    inputBinding:
      prefix: --output_xtc_path
    default: "trajectory.xtc"
  output_cpt_path:
    type: string?
    inputBinding:
      prefix: --output_cpt_path
  config:
    type: string?
    inputBinding:
      prefix: --config
baseCommand: pdb2gmx

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_pdb_path:
    type: File
    format: edam:format_1476
    inputBinding:
      position: 1
      prefix: --input_pdb_path
  output_gro_path:
    type: string
    inputBinding:
      position: 2
      prefix: --output_gro_path
    default: "structure.gro"
  output_top_zip_path:
    type: string
    inputBinding:
      position: 3
      prefix: --output_top_zip_path
    default: "topology.zip"
  config:
    type: string?
    inputBinding:
      position: 4
      prefix: --config
baseCommand: solvate

hints:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/biobb_md:0.1.5--py_0

inputs:
  input_solute_gro_path:
    type: File
    format: edam:format_GROMACS_GRO
    inputBinding:
      position: 1
      prefix: --input_solute_gro_path
  output_gro_path:
    type: string
    inputBinding:
      position: 2
      prefix: --output_gro_path
    default: "structure_solvated.gro"
  input_top_zip_path:
    type: File
    format: edam:format_2333
    inputBinding:
      position: 3
      prefix: --input_top_zip_path
  output_top_zip_path:
    type: string
    inputBinding:
      position: 4
      prefix: --output_top_zip_path
    default: "topology_solvated.zip"
  config:
    type: string?
    inputBinding:
      position: 5
      prefix: --config