Jupyter Notebook Protein conformational ensembles generation


Protein Conformational ensembles generation

Workflow included in the ELIXIR 3D-Bioinfo Implementation Study:

Building on PDBe-KB to chart and characterize the conformation landscape of native proteins

This tutorial illustrates, step by step, how to generate protein conformational ensembles from 3D structures and analyse their molecular flexibility using the BioExcel Building Blocks library (biobb).

Conformational landscape of native proteins

Proteins are dynamic systems that adopt multiple conformational states, a property essential for many biological processes (e.g. binding other proteins, nucleic acids, or small molecule ligands, or switching between functionally active and inactive states). Characterizing the different conformational states of proteins and the transitions between them is therefore critical for gaining insight into their biological function, and can help explain the effects of genetic variants in health and disease and the action of drugs.

Structural biology has become increasingly efficient in sampling the different conformational states of proteins. The PDB currently archives more than 170,000 individual structures, but over two thirds of these structures represent multiple conformations of the same or a related protein, observed in different crystal forms, when interacting with other proteins or other macromolecules, or upon binding small molecule ligands. Charting this conformational diversity across the PDB can therefore be employed to build a useful approximation of the conformational landscape of native proteins.

A number of resources and tools describing and characterizing various, often complementary, aspects of protein conformational diversity in known structures have been developed, notably by groups in Europe. These tools include algorithms of varying sophistication for aligning the 3D structures of individual protein chains or domains and of protein assemblies, and for evaluating their degree of structural similarity. Using such tools one can align structures pairwise, compute the corresponding similarity matrix, and identify ensembles of structures/conformations with a defined similarity level that tend to recur in different PDB entries, an operation typically performed using clustering methods. Such workflows underlie resources such as CATH, Contemplate, or PDBflex, which offer access to conformational ensembles comprising similar conformations clustered according to various criteria. Other types of tools focus on differences between protein conformations, identifying regions of proteins that undergo large collective displacements in different PDB entries, those that act as hinges or linkers, or regions that are inherently flexible.
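
The "align pairwise, then compute a similarity matrix" step can be illustrated with a minimal sketch. It assumes the conformations are already superimposed and given as equal-length lists of (x, y, z) tuples. The toy coordinates and the helper names `rmsd` and `rmsd_matrix` are hypothetical and do not belong to any of the tools named above.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(a) != len(b):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))

def rmsd_matrix(conformations):
    """Symmetric matrix of pairwise RMSD values (zero on the diagonal)."""
    n = len(conformations)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = rmsd(conformations[i], conformations[j])
    return matrix

# Toy data: three two-atom "conformations" (hypothetical values, in Angstroms)
confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)],
]
matrix = rmsd_matrix(confs)
for row in matrix:
    print(["%.2f" % value for value in row])
```

A matrix of this kind is the usual input to the clustering step mentioned in the text; real tools perform an optimal superposition before measuring the deviation, which this sketch skips.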

To build a meaningful approximation of the conformational landscape of native proteins, the conformational ensembles (and the differences between them), identified on the basis of structural similarity/dissimilarity measures alone, need to be biophysically characterized. This may be approached at two different levels.

At the biological level, it is important to link observed conformational ensembles to their functional roles by evaluating the correspondence with protein family classifications based on sequence information and functional annotations in public databases, e.g. UniProt and the PDBe-Knowledge Base (PDBe-KB). These links should provide valuable mechanistic insights into how the conformational and dynamic properties of proteins are exploited by evolution to regulate their biological function.
At the physical level, one needs to introduce energetic considerations to evaluate the likelihood that the identified conformational ensembles represent conformational states that the protein (or domain under study) samples in isolation. Such evaluation is notoriously challenging and can only be roughly approximated by using computational methods to evaluate the extent to which the observed conformational ensembles can be reproduced by algorithms that simulate the dynamic behavior of protein systems. These algorithms include the computationally expensive classical molecular dynamics (MD) simulations, which sample local thermal fluctuations, but also faster, more approximate methods such as Elastic Network Models and Normal Mode Analysis (NMA), which model low-energy collective motions. Alternatively, enhanced sampling molecular dynamics can be used to model complex types of conformational changes, but at a very high computational cost.
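
As a rough illustration of the elastic network idea, the following sketch builds the connectivity (Kirchhoff) matrix used by Gaussian Network Models: every pair of C-alpha atoms closer than a cutoff is joined by an identical spring. The 7 Angstrom cutoff, the toy coordinates and the function name are assumptions made for illustration; the tools used later in this notebook have their own implementations and parameters.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """Connectivity matrix: -1 for each contact, contact count on the diagonal."""
    n = len(coords)
    k = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                k[i][j] = k[j][i] = -1
                k[i][i] += 1
                k[j][j] += 1
    return k

# Hypothetical straight chain of four C-alpha atoms spaced 3.8 Angstroms apart
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
K = kirchhoff_matrix(ca_coords)
for row in K:
    print(row)
```

In a Gaussian Network Model the collective motions are obtained from the eigenvectors of this matrix with the smallest non-zero eigenvalues, which is why such models are so much cheaper than molecular dynamics.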

The ELIXIR 3D-Bioinfo Implementation Study "Building on PDBe-KB to chart and characterize the conformation landscape of native proteins" focuses on:

Mapping the conformational diversity of proteins and their homologs across the PDB.
Characterizing the different flexibility properties of protein regions, and linking this information to sequence and functional annotation.
Benchmarking computational methods that can predict a biophysical description of protein motions.

This notebook is part of the third objective, in which computational resources able to predict protein flexibility and conformational ensembles have been collected, evaluated, and integrated into reproducible and interoperable workflows using the BioExcel Building Blocks library. Note that the list is not meant to be exhaustive; it reflects the expertise of the implementation study partners.

Code Snippets

import os
import nglview
import simpletraj
import plotly
import plotly.graph_objs as go
import numpy as np
import pandas as pd
import ipywidgets
import json
import zipfile
from IPython.display import display, Markdown

pdbCode = "1ake"
num_frames = 300

Jupyter Notebook Pandas numpy JSON plotly ipywidgets NGLview ipython simpletraj From line 2 of notebooks/biobb_wf_flexdyn.ipynb

# Downloading desired PDB file 
# Import module
from biobb_io.api.pdb import pdb

# Create properties dict and inputs/outputs
downloaded_pdb = pdbCode+'.pdb'

prop = {
    'pdb_code': pdbCode,
    'api_id' : 'mmb'
}

#Create and launch bb
pdb(output_pdb_path=downloaded_pdb,
    properties=prop)

Jupyter Notebook biobb-io biobb_io From line 19 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_structure_utils.utils.extract_model import extract_model

pdb_model = pdbCode+'_model.pdb'

prop = {
    'models': [ 1 ]
}

extract_model(input_structure_path=downloaded_pdb,
              output_structure_path=pdb_model,
              properties=prop)

Jupyter Notebook biobb_structure_utils biobb-structure-utils From line 37 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_structure_utils.utils.extract_chain import extract_chain

monomer = pdbCode+'_monomer.pdb'

prop = {
    'chains': [ 'A' ]
}

extract_chain(input_structure_path=pdb_model,
            output_structure_path=monomer,
            properties=prop)

Jupyter Notebook biobb_structure_utils biobb-structure-utils From line 51 of notebooks/biobb_wf_flexdyn.ipynb

# Show protein
view = nglview.show_structure_file(monomer)
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 65 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_mask import cpptraj_mask

prot_backbone = pdbCode + "_backbone.pdb"

prop = {
    'mask': 'backbone',
    'format': 'pdb'
}

cpptraj_mask(input_top_path=monomer,
            input_traj_path=monomer,
            output_cpptraj_path=prot_backbone,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 73 of notebooks/biobb_wf_flexdyn.ipynb

# Show protein
view = nglview.show_structure_file(prot_backbone)
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 89 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_mask import cpptraj_mask

prot_ca = pdbCode + "_ca.pdb"

prop = {
    'mask': 'c-alpha',
    'format': 'pdb'
}

cpptraj_mask(input_top_path=monomer,
            input_traj_path=monomer,
            output_cpptraj_path=prot_ca,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 97 of notebooks/biobb_wf_flexdyn.ipynb
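
What a C-alpha selection keeps can be illustrated without cpptraj: in the PDB format the atom name occupies columns 13-16 of each coordinate record. The sketch below filters a few hand-built records. The helper names and the records are hypothetical, and the code only mimics the result of the selection, not how cpptraj implements its masks.

```python
def pdb_atom_line(record, serial, name, resname, chain, resseq):
    """Build the fixed-width prefix (columns 1-26) of a PDB coordinate record."""
    return f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}"

def select_c_alpha(lines):
    """Keep ATOM records whose atom name field (columns 13-16) is CA."""
    return [
        line for line in lines
        if line.startswith("ATOM  ") and line[12:16].strip() == "CA"
    ]

records = [
    pdb_atom_line("ATOM", 1, " N", "MET", "A", 1),
    pdb_atom_line("ATOM", 2, " CA", "MET", "A", 1),
    pdb_atom_line("ATOM", 9, " CA", "ARG", "A", 2),
    pdb_atom_line("HETATM", 99, "CA", "CA", "A", 301),  # calcium ion, not a C-alpha
]
ca_lines = select_c_alpha(records)
for line in ca_lines:
    print(line)
```

Restricting the match to ATOM records is what keeps the calcium ion, whose atom name is also CA, out of the selection.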

# Show protein
view = nglview.show_structure_file(prot_ca)
view.add_representation(repr_type='ball+stick', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 113 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexdyn.flexdyn.concoord_dist import concoord_dist

concoord_dist_pdb = pdbCode + "_dist.pdb"
concoord_dist_gro = pdbCode + "_dist.gro"
concoord_dist_dat = pdbCode + "_dist.dat"

concoord_lib = os.environ['CONDA_PREFIX']+"/share/concoord/lib"

prop = {
    'retain_hydrogens' : False,
    'cutoff' : 4.0,
    'env_vars_dict' : {
        'CONCOORD_OVERWRITE' : '1',
        'CONCOORDLIB' : concoord_lib
    }
}

concoord_dist(  input_structure_path=monomer,
                output_pdb_path=concoord_dist_pdb,
                output_gro_path=concoord_dist_gro,
                output_dat_path=concoord_dist_dat,
                properties=prop)

Jupyter Notebook biobb_flexdyn concoord biobb-flexdyn From line 122 of notebooks/biobb_wf_flexdyn.ipynb
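
Note that `os.environ['CONDA_PREFIX']` raises a bare `KeyError` when the notebook runs outside an activated conda environment. A small helper such as the one below gives a clearer message. It is only a sketch: the function name and the example prefix are hypothetical, and the `share/concoord/lib` suffix is copied from the snippet above.

```python
import os

def concoord_lib_dir(environ=None):
    """Return the CONCOORD library directory below the active conda prefix."""
    env = os.environ if environ is None else environ
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        raise RuntimeError(
            "CONDA_PREFIX is not set; activate the conda environment first"
        )
    return prefix + "/share/concoord/lib"

# Demonstration with an explicit, hypothetical environment mapping
example_lib = concoord_lib_dir({"CONDA_PREFIX": "/opt/conda/envs/biobb"})
print(example_lib)
```

Calling `concoord_lib_dir()` without arguments reads the real process environment, so it can replace the direct dictionary lookup in both CONCOORD steps.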

from biobb_flexdyn.flexdyn.concoord_disco import concoord_disco

concoord_disco_pdb = pdbCode + "_disco_traj.pdb"
concoord_disco_rmsd = pdbCode + "_disco_rmsd.dat"
concoord_disco_bfactor = pdbCode + "_disco_bfactor.pdb"

concoord_lib = os.environ['CONDA_PREFIX']+"/share/concoord/lib"

prop = {
    'vdw' : 4,
    'num_structs' : num_frames,
    'env_vars_dict' : {
        'CONCOORD_OVERWRITE' : '1',
        'CONCOORDLIB' : concoord_lib
    }
}

concoord_disco(     input_pdb_path=concoord_dist_pdb,
                    input_dat_path=concoord_dist_dat,
                    output_traj_path=concoord_disco_pdb,
                    output_rmsd_path=concoord_disco_rmsd,
                    output_bfactor_path=concoord_disco_bfactor,
                    properties=prop)

Jupyter Notebook biobb_flexdyn concoord biobb-flexdyn From line 147 of notebooks/biobb_wf_flexdyn.ipynb

# Show protein (if num_frames <= 100)

if num_frames <= 100:
    view = nglview.show_structure_file(concoord_disco_pdb, default_representation=False)
    view.add_representation(repr_type='line', selection='all', color='modelindex')
    view.center()
    view._remote_call('setSize', target='Widget', args=['','600px'])
    # A bare expression inside an if-block is not rendered by Jupyter, so display it explicitly
    display(view)
else:
    display(Markdown('<div class="alert alert-info">Visualizing a multi-model PDB with more than 100 frames is not recommended. Please use the trajectory visualization below.</div>'))

Jupyter Notebook From line 173 of notebooks/biobb_wf_flexdyn.ipynb

view = nglview.show_structure_file(concoord_disco_pdb, default_representation=False)
view.clear_representations()
view.add_representation(repr_type='backbone', selection='all', color='modelindex')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 187 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

concoord_rmsd = pdbCode + "_concoord_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}
cpptraj_rms(input_top_path=concoord_dist_pdb,
            input_traj_path=concoord_disco_pdb,
            output_cpptraj_path=concoord_rmsd,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 196 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(concoord_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 215 of notebooks/biobb_wf_flexdyn.ipynb
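
To check the histogram outside plotly, the same binning can be sketched in plain Python. The sketch assumes a whitespace-delimited file with one header line and the RMSD value in the second column, which is the layout the snippet above reads. The sample content and the helper name are hypothetical.

```python
import io
from collections import Counter

def rmsd_histogram(handle, bin_size=0.04):
    """Count RMSD values per bin; key 0 covers [0, bin_size), key 1 the next bin, etc."""
    next(handle)  # skip the header line
    counts = Counter()
    for line in handle:
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        counts[int(float(fields[1]) // bin_size)] += 1
    return counts

# Hypothetical file content in the two-column layout read above
sample = io.StringIO(
    "#Frame RMSD_00004\n"
    "1 0.01\n"
    "2 0.02\n"
    "3 0.05\n"
    "4 0.10\n"
)
hist = rmsd_histogram(sample)
for index in sorted(hist):
    print(f"bin {index}: {hist[index]} frame(s)")
```

The default `bin_size` matches the `size=0.04` passed to `xbins` in the plotting snippets.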

from biobb_analysis.ambertools.cpptraj_convert import cpptraj_convert

concoord_trr = pdbCode + "_disco_traj.trr"

prop = {
    'mask' : 'c-alpha',
    'format': 'trr'
}

cpptraj_convert(input_top_path=concoord_dist_pdb,
                input_traj_path=concoord_disco_pdb,
                output_cpptraj_path=concoord_trr,
                properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 232 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(concoord_trr, prot_ca), gui=True)
view.center()
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 248 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexdyn.flexdyn.prody_anm import prody_anm

prody_ensemble = pdbCode + "_prody_anm_traj.pdb"

prop = {
    'selection' : 'backbone',
    'num_structs' : num_frames,
    'rmsd' : 2.0
}

prody_anm(  input_pdb_path=monomer,
            output_pdb_path=prody_ensemble,
            properties=prop)

Jupyter Notebook biobb_flexdyn ProDy biobb-flexdyn From line 257 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

prody_rmsd = pdbCode + "_prody_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}
cpptraj_rms(input_top_path=prody_ensemble,
            input_traj_path=prody_ensemble,
            output_cpptraj_path=prody_rmsd,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 273 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(prody_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 292 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_convert import cpptraj_convert

prody_trr = pdbCode + "_prody_anm_traj.trr"

prop = {
    'mask' : 'c-alpha',
    'format': 'trr'
}

cpptraj_convert(input_top_path=prot_backbone,
                input_traj_path=prody_ensemble,
                output_cpptraj_path=prody_trr,
                properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 309 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(prody_trr, prot_ca), gui=True)
view.center()
view.add_representation(repr_type='ball+stick', selection='all')
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 325 of notebooks/biobb_wf_flexdyn.ipynb

# Running Brownian Dynamics (BD)
# Import module
from biobb_flexserv.flexserv.bd_run import bd_run

# Create properties dict and inputs/outputs

bd_log = pdbCode + '_flexserv_bd_ensemble.log'
bd_crd = pdbCode + '_flexserv_bd_ensemble.mdcrd'

wfreq = 100
time = num_frames * wfreq

prop = {
    'time': time,
    'wfreq': wfreq
}

bd_run( 
     input_pdb_path=prot_ca,
     output_crd_path=bd_crd,
     output_log_path=bd_log,
     properties=prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 334 of notebooks/biobb_wf_flexdyn.ipynb
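
The relation between `time`, `wfreq` and the number of stored frames can be checked with simple arithmetic. The sketch assumes that both values are expressed in simulation steps and that one frame is written every `wfreq` steps, which is the reasoning behind `time = num_frames * wfreq` above. The function name is hypothetical.

```python
def expected_frames(total_steps, write_frequency):
    """Number of frames written when one frame is stored every write_frequency steps."""
    if write_frequency <= 0:
        raise ValueError("write_frequency must be a positive number of steps")
    return total_steps // write_frequency

num_frames = 300
wfreq = 100
total_steps = num_frames * wfreq   # same product as in the snippet above
frames = expected_frames(total_steps, wfreq)
print(total_steps, frames)
```

Choosing the total length as a multiple of the writing frequency keeps this ensemble the same size as the ones produced by the other methods.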

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

flexserv_bd_rmsd = pdbCode + "_flexserv_bd_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}
cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=bd_crd,
            output_cpptraj_path=flexserv_bd_rmsd,
            input_exp_path=monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 360 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(flexserv_bd_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 379 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

flexserv_bd_rmsd = pdbCode + "_flexserv_bd_rmsd.dat" 
flexserv_bd_traj_fitted = pdbCode + "_flexserv_bd_traj_fitted.trr"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}
cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=bd_crd,
            output_cpptraj_path=flexserv_bd_rmsd,
            output_traj_path=flexserv_bd_traj_fitted,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 396 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(flexserv_bd_traj_fitted, prot_ca), gui=True)
view.add_representation(repr_type='ball+stick', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 417 of notebooks/biobb_wf_flexdyn.ipynb

# Running Discrete Molecular Dynamics (DMD)
# Import module
from biobb_flexserv.flexserv.dmd_run import dmd_run

# Create properties dict and inputs/outputs

dmd_log = pdbCode + '_flexserv_dmd_ensemble.log'
dmd_crd = pdbCode + '_flexserv_dmd_ensemble.mdcrd'

prop = {
    'frames': num_frames
}
dmd_run( 
     input_pdb_path=prot_ca,
     output_crd_path=dmd_crd,
     output_log_path=dmd_log,
     properties=prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 426 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

flexserv_dmd_rmsd = pdbCode + "_flexserv_dmd_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}

cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=dmd_crd,
            output_cpptraj_path=flexserv_dmd_rmsd,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 447 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(flexserv_dmd_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 467 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

flexserv_dmd_rmsd = pdbCode + "_flexserv_dmd_rmsd.dat"
flexserv_dmd_traj_fitted = pdbCode + "_flexserv_dmd_traj_fitted.trr"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}

cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=dmd_crd,
            output_cpptraj_path=flexserv_dmd_rmsd,
            output_traj_path=flexserv_dmd_traj_fitted,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 484 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(flexserv_dmd_traj_fitted, prot_ca), gui=True)
view.add_representation(repr_type='ball+stick', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 506 of notebooks/biobb_wf_flexdyn.ipynb

# Running Normal Mode Analysis (NMA)
# Import module
from biobb_flexserv.flexserv.nma_run import nma_run

# Create properties dict and inputs/outputs

nma_log = pdbCode + '_flexserv_nma_ensemble.log'
nma_crd = pdbCode + '_flexserv_nma_ensemble.mdcrd'

prop = {
    'frames' : num_frames
}

nma_run( 
     input_pdb_path=prot_ca,
     output_crd_path=nma_crd,
     output_log_path=nma_log,
     properties=prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv NMA From line 515 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

flexserv_nma_rmsd = pdbCode + "_flexserv_nma_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}

cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=nma_crd,
            output_cpptraj_path=flexserv_nma_rmsd,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 537 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(flexserv_nma_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 557 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_convert import cpptraj_convert

nma_trr = pdbCode + '_flexserv_nma_ensemble.trr'

prop = {
    'mask' : 'c-alpha',
    'format': 'trr'
}

cpptraj_convert(input_top_path=prot_ca,
                input_traj_path=nma_crd,
                output_cpptraj_path=nma_trr,
                properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 574 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(nma_trr, prot_ca), gui=True)
view.add_representation(repr_type='ball+stick', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 590 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexdyn.flexdyn.nolb_nma import nolb_nma

nolb_pdb = pdbCode + '_nolb_ensemble.pdb'

prop = {
    'num_structs' : num_frames,
    'rmsd' : 4
}

nolb_nma(   input_pdb_path=prot_ca,
        output_pdb_path=nolb_pdb,
        properties=prop)

Jupyter Notebook biobb_flexdyn biobb-flexdyn nolb From line 599 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

nolb_rmsd = pdbCode + "_nolb_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}

cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=nolb_pdb,
            output_cpptraj_path=nolb_rmsd,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 614 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(nolb_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 634 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_convert import cpptraj_convert

nolb_trr = pdbCode + '_nolb_ensemble.trr'

prop = {
    'mask' : 'c-alpha',
    'format': 'trr'
}

cpptraj_convert(input_top_path=prot_ca,
                input_traj_path=nolb_pdb,
                output_cpptraj_path=nolb_trr,
                properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 651 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(nolb_trr, prot_ca), gui=True)
view.clear_representations()
view.add_representation(repr_type='ball+stick', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 667 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexdyn.flexdyn.imod_imode import imod_imode

imode_evecs = pdbCode + '_imode_evecs.dat'

prop = {
    'cg' : 2
}

imod_imode(  input_pdb_path=monomer,
        output_dat_path=imode_evecs,
        properties=prop)

Jupyter Notebook biobb_flexdyn biobb-flexdyn IMOD From line 677 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexdyn.flexdyn.imod_imc import imod_imc

imc_pdb = pdbCode + '_imc.pdb'

prop = {
    'num_structs': num_frames,
    'num_modes': 10,
    'amplitude': 6.0
}

imod_imc(   input_pdb_path=monomer,
            input_dat_path=imode_evecs,
            output_traj_path=imc_pdb,
            properties=prop)

Jupyter Notebook biobb_flexdyn biobb-flexdyn IMOD From line 691 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

imods_rmsd = pdbCode + "_imods_rmsd.dat"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}

cpptraj_rms(input_top_path=imc_pdb,
            input_traj_path=imc_pdb,
            output_cpptraj_path=imods_rmsd,
            input_exp_path= monomer,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 708 of notebooks/biobb_wf_flexdyn.ipynb

df = pd.read_csv(imods_rmsd, header=0, delimiter=r'\s+')  # raw string avoids an invalid-escape warning

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Histogram(x=df['RMSD_00004'], xbins=dict(
    size=0.04), autobinx=False)],
    "layout": go.Layout(title="RMSd variance",
                        xaxis=dict(title = "RMSd (Angstroms)"),
                        yaxis=dict(title = "Population")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 728 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_convert import cpptraj_convert

imods_trr = pdbCode + '_imods_ensemble.trr'

prop = {
    'mask' : 'c-alpha',
    'format': 'trr'
}

cpptraj_convert(input_top_path=imc_pdb,
                input_traj_path=imc_pdb,
                output_cpptraj_path=imods_trr,
                properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 745 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(imods_trr, prot_ca), gui=True)
view.clear_representations()
view.add_representation(repr_type='ball+stick', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 761 of notebooks/biobb_wf_flexdyn.ipynb

traj_zip = pdbCode + "_concat_traj.zip"

with zipfile.ZipFile(traj_zip, 'w') as myzip:
    myzip.write(concoord_trr)
    myzip.write(prody_trr)
    myzip.write(imods_trr)
    #myzip.write(flexserv_bd_traj_fitted)
    myzip.write(flexserv_dmd_traj_fitted)    
    myzip.write(nma_trr)

Jupyter Notebook From line 771 of notebooks/biobb_wf_flexdyn.ipynb
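
If one of the ensemble steps was skipped or failed, `myzip.write` stops with an error on the missing file. A more defensive variant adds only the files that exist and reports the ones it skipped. This is a sketch with a hypothetical helper name and placeholder file names, not part of the original notebook.

```python
import os
import tempfile
import zipfile

def zip_existing(zip_path, paths):
    """Write the existing files to zip_path; return (added, skipped) path lists."""
    added, skipped = [], []
    with zipfile.ZipFile(zip_path, "w") as archive:
        for path in paths:
            if os.path.isfile(path):
                archive.write(path, arcname=os.path.basename(path))
                added.append(path)
            else:
                skipped.append(path)
    return added, skipped

# Demonstration with temporary placeholder files (hypothetical names)
workdir = tempfile.mkdtemp()
present = os.path.join(workdir, "a_traj.trr")
missing = os.path.join(workdir, "b_traj.trr")
with open(present, "wb") as handle:
    handle.write(b"placeholder")
demo_zip = os.path.join(workdir, "demo.zip")
added, skipped = zip_existing(demo_zip, [present, missing])
print("added:", added)
print("skipped:", skipped)
```

Reviewing the skipped list before concatenating makes it obvious which methods are missing from the meta-trajectory.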

from biobb_gromacs.gromacs.trjcat import trjcat

concat_trr = pdbCode + "_concat_traj.trr"

trjcat(input_trj_zip_path=traj_zip,
       output_trj_path=concat_trr)

Jupyter Notebook biobb_gromacs biobb-gromacs From line 783 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_gromacs.gromacs.make_ndx import make_ndx

gmx_index_file = pdbCode + "_gmx_ndx.ndx"

prop = { 'selection': 3 }

make_ndx(input_structure_path=prot_ca,
         output_ndx_path=gmx_index_file,
         properties=prop)

Jupyter Notebook biobb_gromacs biobb-gromacs From line 792 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.gromacs.gmx_cluster import gmx_cluster

cluster_concat_pdb = pdbCode + "_concat_cluster.pdb"

prop = {
    'fit_selection': 'System',
    'output_selection': 'System',
    'method': 'linkage',
    'cutoff': 0.12 # (0.12 nm = 1.2 Angstroms) 
    #'cutoff': 0.15 # (0.15 nm = 1.5 Angstroms) 
}

gmx_cluster(input_structure_path=prot_ca,
            input_traj_path=concat_trr,
            input_index_path=gmx_index_file,
            output_pdb_path=cluster_concat_pdb,
            properties=prop)

Jupyter Notebook biobb_analysis Gromacs biobb-analysis From line 804 of notebooks/biobb_wf_flexdyn.ipynb
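
The `linkage` method with a cutoff can be pictured as follows: a structure joins a cluster when its distance to any member of that cluster is below the cutoff. The sketch below applies that rule to a precomputed distance matrix in nanometres. The matrix values and the function name are hypothetical, and the code is not the GROMACS implementation.

```python
def single_linkage(dist, cutoff):
    """Group indices so that any pair closer than cutoff ends up in the same cluster."""
    n = len(dist)
    labels = list(range(n))  # every structure starts in its own cluster
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < cutoff and labels[i] != labels[j]:
                old, new = labels[j], labels[i]
                # merge the two clusters by relabelling every member of the old one
                labels = [new if label == old else label for label in labels]
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    return sorted(clusters.values())

# Hypothetical pairwise RMSD values in nm (0.12 nm = 1.2 Angstroms)
dist_nm = [
    [0.00, 0.10, 0.20, 0.50],
    [0.10, 0.00, 0.11, 0.45],
    [0.20, 0.11, 0.00, 0.40],
    [0.50, 0.45, 0.40, 0.00],
]
clusters = single_linkage(dist_nm, cutoff=0.12)
print(clusters)
```

Structures 0 and 2 are 0.20 nm apart, yet they share a cluster because each is within the cutoff of structure 1. This chaining behaviour is worth keeping in mind when choosing between the 0.12 and 0.15 nm cutoffs.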

# Show protein
view = nglview.show_structure_file(cluster_concat_pdb, default_representation=False)
view.add_representation(repr_type='tube', selection='all', color='modelindex')
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 824 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_rms import cpptraj_rms

meta_traj_rmsd = pdbCode + "_meta_traj_rmsd.dat" 
meta_traj_fitted = pdbCode + "_meta_traj_fitted.crd"

prop = {
    'start': 1,
    'end': -1,
    'steps': 1,
    'mask': 'c-alpha',
    'reference': 'experimental'
}
cpptraj_rms(input_top_path=prot_ca,
            input_traj_path=cluster_concat_pdb,
            output_cpptraj_path=meta_traj_rmsd,
            output_traj_path=meta_traj_fitted,
            input_exp_path= prot_ca,
            properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 832 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_zip import pcz_zip

concat_pcz = pdbCode + '_concat_ensemble.pcz'
concat_pcz_gaussian = pdbCode + '_concat_ensemble_gaussian.pcz'

# Classical RMSd fitting
prop = {
    'variance': 90,
    'neigenv' : 10
}

pcz_zip( input_pdb_path=prot_ca,
        input_crd_path=meta_traj_fitted,
        output_pcz_path=concat_pcz,
        properties=prop)

# Gaussian (weighted) RMSd fitting
prop = {
    'variance': 90,
    'neigenv' : 10,
    'gauss_rmsd' : True
}

pcz_zip( input_pdb_path=prot_ca,
        input_crd_path=meta_traj_fitted,
        output_pcz_path=concat_pcz_gaussian,
        properties=prop)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 853 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_info import pcz_info

pcz_report = pdbCode + "_pcz_report.json"

pcz_info( 
    input_pcz_path=concat_pcz,
    output_json_path=pcz_report
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 883 of notebooks/biobb_wf_flexdyn.ipynb

with open(pcz_report, 'r') as f:
  pcz_info = json.load(f)
print(json.dumps(pcz_info, indent=2))

Jupyter Notebook From line 894 of notebooks/biobb_wf_flexdyn.ipynb

# Plotting Variance Profile
y = np.array(pcz_info['Eigen_Values'])
x = list(range(1,len(y)+1))

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Scatter(x=x, y=y)],
    "layout": go.Layout(title="Variance Profile",
                        xaxis=dict(title = "Principal Component"),
                        yaxis=dict(title = "Variance")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 900 of notebooks/biobb_wf_flexdyn.ipynb
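
The `'variance': 90` property used when building the compressed file asks for enough components to explain 90% of the variance. The sketch below shows one way to derive that number from a list of eigenvalues. It assumes the eigenvalues are sorted in decreasing order and cover the full spectrum; the function name `components_for_variance` is hypothetical and is not part of biobb_flexserv.

```python
import numpy as np

def components_for_variance(eigenvalues, target_percent=90.0):
    """Return the cumulative explained variance (%) and the number of
    leading components needed to reach `target_percent`."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)

    # Percentage of the total variance carried by each component
    explained = 100.0 * eigenvalues / eigenvalues.sum()
    cumulative = np.cumsum(explained)

    # First position where the cumulative sum reaches the target
    n_components = int(np.searchsorted(cumulative, target_percent)) + 1
    n_components = min(n_components, len(eigenvalues))
    return cumulative, n_components
```

For example, `components_for_variance([5, 3, 1, 1], 85)` reports that three components are needed, since the first two explain only 80% of the total.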

# Plotting Dimensionality/quality profile
y = np.array(pcz_info['Eigen_Values_dimensionality_vs_total'])
x = list(range(1,len(y)+1))

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Scatter(x=x, y=y)],
    "layout": go.Layout(title="Dimensionality/Quality profile",
                        xaxis=dict(title = "Principal Component"),
                        yaxis=dict(title = "Accumulated Quality (%)")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 918 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_evecs import pcz_evecs

pcz_evecs_report = pdbCode + "_pcz_evecs.json"

prop = {
    'eigenvector': 1
}

pcz_evecs( 
        input_pcz_path=concat_pcz,
        output_json_path=pcz_evecs_report,
        properties=prop)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 936 of notebooks/biobb_wf_flexdyn.ipynb

with open(pcz_evecs_report, 'r') as f:
    pcz_evecs_report_json = json.load(f)
print(json.dumps(pcz_evecs_report_json, indent=2))

Jupyter Notebook From line 951 of notebooks/biobb_wf_flexdyn.ipynb

# Plotting Eigen Value Residue Components
y = np.array(pcz_evecs_report_json['projs'])
x = list(range(1,len(y)+1))

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Bar(x=x, y=y)],
    "layout": go.Layout(title="Eigen Value Residue Components",
                        xaxis=dict(title = "Residue Number"),
                        yaxis=dict(title = "\u00C5")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 957 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_animate import pcz_animate

proj1 = pdbCode + "_pcz_proj1.crd"

prop = {
    'eigenvector': 1  # Try changing the eigenvector number!
}

pcz_animate( input_pcz_path=concat_pcz,
        output_crd_path=proj1,
        properties=prop)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 975 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_analysis.ambertools.cpptraj_convert import cpptraj_convert

proj1_dcd = pdbCode + '_pcz_proj1.dcd'

prop = {
    'format': 'dcd'
}

cpptraj_convert(input_top_path=prot_ca,
                input_traj_path=proj1,
                output_cpptraj_path=proj1_dcd,
                properties=prop)

Jupyter Notebook biobb_analysis biobb-analysis CPPTRAJ From line 989 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(proj1_dcd, prot_ca), gui=True)
#view.add_representation(repr_type='spacefill', radius=0.7, selection='all')
view.add_representation(repr_type='surface', selection='all')
view.center()
view._remote_call('setSize', target='Widget', args=['','600px'])
view

Jupyter Notebook From line 1004 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_bfactor import pcz_bfactor

bfactor_all_dat = pdbCode + "_bfactor_all.dat"
bfactor_all_pdb = pdbCode + "_bfactor_all.pdb"

prop = {
    'eigenvector': 0,  # 0 = All modes
    'pdb': True        # also write a PDB file (output_pdb_path)
}

pcz_bfactor( 
    input_pcz_path=concat_pcz,
    output_dat_path=bfactor_all_dat,
    output_pdb_path=bfactor_all_pdb,
    properties=prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 1014 of notebooks/biobb_wf_flexdyn.ipynb

# Plotting the B-factors x Residue x PCA mode
y = np.loadtxt(bfactor_all_dat)
x = list(range(1,len(y)+1))

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Scatter(x=x, y=y)],
    "layout": go.Layout(title="Bfactor x Residue x PCA Modes (All)",
                        xaxis=dict(title = "Residue Number"),
                        yaxis=dict(title = "Bfactor (" + '\u00C5' +'\u00B2' + ")")
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 1033 of notebooks/biobb_wf_flexdyn.ipynb
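
The B-factors plotted above are expressed in Å², which matches the standard relation between a B-factor and an atom's mean-square fluctuation, B = (8π²/3)⟨Δr²⟩. The sketch below applies that relation directly to a set of fitted coordinates. It is an illustration only: `pcz_bfactor` derives its values from the PCA modes, and the function name `bfactors_from_trajectory` is hypothetical.

```python
import numpy as np

def bfactors_from_trajectory(coords):
    """coords: array of shape (n_frames, n_atoms, 3), fitted, in Å.

    Returns one B-factor per atom, in Å².
    """
    coords = np.asarray(coords, dtype=float)
    mean_structure = coords.mean(axis=0)

    # Mean-square fluctuation of each atom around its average position
    msf = ((coords - mean_structure) ** 2).sum(axis=2).mean(axis=0)

    # B = (8 * pi^2 / 3) * <dr^2>
    return (8.0 * np.pi ** 2 / 3.0) * msf
```

An atom that does not move across the ensemble gets a B-factor of zero, while flexible loops and termini typically stand out as peaks.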

# Show trajectory
view = nglview.show_simpletraj(nglview.SimpletrajTrajectory(proj1_dcd, bfactor_all_pdb))
view.add_representation(repr_type='spacefill', selection='all', colorScheme='bfactor')
view.add_representation(repr_type='tube', radius='0.4', selection='all', color='white')
view._remote_call('setSize', target='Widget', args=['','600px'])

stop = False
def loop(view):
    import time
    def do():
        direction = 1  # start by playing forward
        while not stop:
            # Reverse direction at either end of the trajectory
            if view.frame == view.max_frame:
                direction = -1
            if view.frame == 0:
                direction = 1
            view.frame = view.frame + direction
            time.sleep(0.2)
    view._run_on_another_thread(do)

view.on_displayed(loop)

view._iplayer.children[0].disabled = True
view._iplayer.children[1].disabled = True

view

Jupyter Notebook From line 1051 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_hinges import pcz_hinges

hinges_bfactor_report = pdbCode + "_hinges_bfactor_report.json"
hinges_dyndom_report = pdbCode + "_hinges_dyndom_report.json"
hinges_fcte_report = pdbCode + "_hinges_fcte_report.json"

bfactor_method = "Bfactor_slope"
dyndom_method = "Dynamic_domain"
fcte_method = "Force_constant"

bfactor_prop = {
    'eigenvector': 0, # 0 = All modes
    'method': bfactor_method
}

dyndom_prop = {
    'eigenvector': 0, # 0 = All modes
    'method': dyndom_method
}

fcte_prop = {
    'eigenvector': 0, # 0 = All modes
    'method': fcte_method
}

pcz_hinges( 
        input_pcz_path=concat_pcz_gaussian,
        output_json_path=hinges_bfactor_report,
        properties=bfactor_prop
)

pcz_hinges( 
        input_pcz_path=concat_pcz_gaussian,
        output_json_path=hinges_dyndom_report,
        properties=dyndom_prop
)

pcz_hinges( 
        input_pcz_path=concat_pcz_gaussian,
        output_json_path=hinges_fcte_report,
        properties=fcte_prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 1079 of notebooks/biobb_wf_flexdyn.ipynb

with open(hinges_bfactor_report, 'r') as f:
    hinges_bfactor = json.load(f)
print(json.dumps(hinges_bfactor, indent=2))

with open(hinges_dyndom_report, 'r') as f:
    hinges_dyndom = json.load(f)
print(json.dumps(hinges_dyndom, indent=2))

with open(hinges_fcte_report, 'r') as f:
    hinges_fcte = json.load(f)
print(json.dumps(hinges_fcte, indent=2))

Jupyter Notebook From line 1124 of notebooks/biobb_wf_flexdyn.ipynb

# Show trajectory
view1 = nglview.show_simpletraj(nglview.SimpletrajTrajectory(proj1_dcd, bfactor_all_pdb), gui=True)
view1.add_representation(repr_type='surface', selection=hinges_dyndom["clusters"][0]["residues"], color='red')
view1.add_representation(repr_type='surface', selection=hinges_dyndom["clusters"][1]["residues"], color='green')
#view1.add_representation(repr_type='surface', selection=hinges_dyndom["clusters"][2]["residues"], color='yellow')
#view1.add_representation(repr_type='surface', selection=hinges_dyndom["hinge_residues"], color='red')
view1._remote_call('setSize', target='Widget', args=['350px','350px'])
view1
view2 = nglview.show_simpletraj(nglview.SimpletrajTrajectory(proj1_dcd, bfactor_all_pdb), gui=True)
view2.add_representation(repr_type='surface', selection=hinges_bfactor["hinge_residues"], color='red')
view2._remote_call('setSize', target='Widget', args=['350px','350px'])
view2
view3 = nglview.show_simpletraj(nglview.SimpletrajTrajectory(proj1_dcd, bfactor_all_pdb), gui=True)
view3.add_representation(repr_type='surface', selection=str(hinges_fcte["hinge_residues"]), color='red')    
view3._remote_call('setSize', target='Widget', args=['350px','350px'])
view3
ipywidgets.HBox([view1, view2, view3])

Jupyter Notebook From line 1138 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_stiffness import pcz_stiffness

stiffness_report = pdbCode + "_pcz_stiffness.json"

prop = {
    'eigenvector': 0 # 0 = All modes
}

pcz_stiffness( 
        input_pcz_path=concat_pcz,
        output_json_path=stiffness_report,
        properties=prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 1158 of notebooks/biobb_wf_flexdyn.ipynb

with open(stiffness_report, 'r') as f:
    pcz_stiffness_report = json.load(f)
print(json.dumps(pcz_stiffness_report, indent=2))

Jupyter Notebook From line 1174 of notebooks/biobb_wf_flexdyn.ipynb

y = np.array(pcz_stiffness_report['stiffness'])
x = list(range(1, len(y) + 1))  # one label per row/column of the matrix

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Heatmap(x=x, y=x, z=y, type = 'heatmap', colorscale = 'reds')],
    "layout": go.Layout(title="Apparent Stiffness",
                        xaxis=dict(title = "Residue Number"),
                        yaxis=dict(title = "Residue Number"),
                        width=800, height=800
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 1180 of notebooks/biobb_wf_flexdyn.ipynb

y = np.array(pcz_stiffness_report['stiffness_log'])
x = list(range(1, len(y) + 1))  # one label per row/column of the matrix

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Heatmap(x=x, y=x, z=y, type = 'heatmap', colorscale = 'reds')],
    "layout": go.Layout(title="Apparent Stiffness (Logarithmic Scale)",
                        xaxis=dict(title = "Residue Number"),
                        yaxis=dict(title = "Residue Number"),
                        width=800, height=800
                       )
}

plotly.offline.iplot(fig)

Jupyter Notebook From line 1198 of notebooks/biobb_wf_flexdyn.ipynb

from biobb_flexserv.pcasuite.pcz_collectivity import pcz_collectivity

pcz_collectivity_report = pdbCode + "_pcz_collectivity.json"

prop = {
    'eigenvector':0 # 0 = All modes
}

pcz_collectivity( 
    input_pcz_path=concat_pcz,
    output_json_path=pcz_collectivity_report,
    properties=prop
)

Jupyter Notebook biobb-flexserv biobb_flexserv From line 1216 of notebooks/biobb_wf_flexdyn.ipynb

with open(pcz_collectivity_report, 'r') as f:
    pcz_collectivity_report_json = json.load(f)
print(json.dumps(pcz_collectivity_report_json, indent=2))

Jupyter Notebook From line 1232 of notebooks/biobb_wf_flexdyn.ipynb

z = np.array(pcz_collectivity_report_json['collectivity'])
x = list(range(1,len(z)+1))
x = ["PC" + str(pc) for pc in x]

y = [""]

plotly.offline.init_notebook_mode(connected=True)

fig = {
    "data": [go.Heatmap(x=x, y=y, z=[z], type = 'heatmap', colorscale = 'reds')],
    "layout": go.Layout(title="Collectivity Index",
                        xaxis=dict(title = "Principal Component"),
                        yaxis=dict(title = "Collectivity"),
                        width=1000, height=300
                       )
}

plotly.offline.iplot(fig)
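
A collectivity index measures how many atoms take part in a given mode: values near 1 indicate a motion shared by the whole structure, values near 1/N a motion confined to a single atom. The sketch below uses the entropy-based definition commonly attributed to Brüschweiler, without mass weighting. Whether `pcz_collectivity` uses exactly this expression is an assumption, and the function name `collectivity` is hypothetical.

```python
import numpy as np

def collectivity(eigenvector):
    """Collectivity of one mode given as a flat vector of length 3N
    (x, y, z displacement of each of the N atoms)."""
    vec = np.asarray(eigenvector, dtype=float).reshape(-1, 3)

    # Squared displacement of each atom, normalised to sum to 1
    sq = (vec ** 2).sum(axis=1)
    q = sq / sq.sum()

    # Shannon entropy of the distribution (0 * log 0 is treated as 0)
    nonzero = q[q > 0]
    entropy = -(nonzero * np.log(nonzero)).sum()

    # exp(entropy) is the effective number of moving atoms
    return float(np.exp(entropy) / len(q))
```

With four atoms, a mode that displaces all of them equally gives 1.0, and a mode that moves only one atom gives 0.25.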