Utilities for Provis

All of these files contain scripts and helper methods that are used by the core classes of provis.

provis.utils.atminfo module

provis.utils.atminfo.import_atm_mass_info()

Function to load the dictionary storing atomic mass information by atom type.

Returns:
dict

dictionary of atomic mass by atom name

provis.utils.atminfo.import_atm_size_info(vw=False)

Function to load dictionaries storing atomic radii, color coding and van der Waals radii by atom name.

Coloring from: https://sciencenotes.org/molecule-atom-colors-cpk-colors/

Parameters:
vw: bool, optional

Option to also return the van der Waals radii. Default: False.

Returns:
dict

dictionary of atomic radius by atom name

dict

dictionary of color by atom name

dict

dictionary of van der Waals radius by atom name

provis.utils.atminfo.import_res_size_info()

Function to load dictionaries storing residue radii and color coding by residue name.

Coloring from: http://acces.ens-lyon.fr/biotic/rastop/help/colour.htm

Returns:
dict

dictionary of radius by residue name

dict

dictionary of color by residue name

provis.utils.charges_utils module

Utility functions for computing hydrogen bonds and electrostatics on protein surface.

provis.utils.charges_utils.compute_angle_deviation(a: numpy.ndarray, b: numpy.ndarray, c: numpy.ndarray, theta: float) float

Computes the absolute angle deviation from theta. a, b, c form the three points that define the angle.

Parameters:
a: np.ndarray,

Coordinate vector of the first point.

b: np.ndarray,

Coordinate vector of the second point.

c: np.ndarray,

Coordinate vector of the third point.

theta: float

Angle to compute deviation with respect to.

Returns:
float

absolute deviation of the angle formed by a, b, c with theta
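
As a rough illustration, the deviation can be sketched with NumPy. The sketch assumes that the angle is measured at the middle point b and that theta is given in radians; neither is confirmed by the description above. The name angle_deviation_sketch is hypothetical and is not the provis function.

```python
import numpy as np


def angle_deviation_sketch(a, b, c, theta):
    """Absolute difference between the angle a-b-c (vertex at b) and theta, in radians."""
    # Vectors pointing from the assumed vertex b towards a and c.
    ba = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bc = np.asarray(c, dtype=float) - np.asarray(b, dtype=float)
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    # Clip to guard against round-off pushing the cosine outside [-1, 1].
    angle = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return float(abs(angle - theta))


# A right angle compared against a reference angle of 120 degrees.
dev = angle_deviation_sketch([1, 0, 0], [0, 0, 0], [0, 1, 0], np.deg2rad(120.0))
```

Here dev is the gap between 90 and 120 degrees, expressed in radians.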

provis.utils.charges_utils.compute_angle_penalty(angle_deviation: float) float

Computes the angle penalty corresponding to a given angle deviation.

Parameters:
angle_deviation: float

Angle of deviation.

Returns:
float

Angle penalty.
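
The description does not state the penalty formula. A quadratic falloff clipped at zero is one plausible choice, sketched below. Both the functional form and the scale parameter (with its default of pi/3) are assumptions made for illustration, not values taken from provis.

```python
import math


def angle_penalty_sketch(angle_deviation, scale=math.pi / 3):
    """Map an angle deviation (radians) to a weight in [0, 1].

    Assumed form: 1 at zero deviation, falling quadratically to 0 once the
    deviation reaches `scale`. `scale` is a hypothetical parameter.
    """
    return max(0.0, 1.0 - (angle_deviation / scale) ** 2)


no_deviation = angle_penalty_sketch(0.0)       # full weight
large_deviation = angle_penalty_sketch(math.pi)  # clipped to zero
```

A weight of this kind can scale a per-atom contribution so that poorly aligned geometries count less.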

provis.utils.charges_utils.compute_hbond_helper(atom_name: str, res: Bio.PDB.Residue.Residue, v: numpy.ndarray) float

Helper function. Computes the hydrogen bond value for a given atom.

Parameters:
atom_name: str

Name of atom.

res: Residue

Residue corresponding to atom.

v: np.ndarray

Vertices.

Returns:
float

hydrogen bonds of the atom

provis.utils.charges_utils.compute_plane_deviation(a: numpy.ndarray, b: numpy.ndarray, c: numpy.ndarray, d: numpy.ndarray) float

Computes the absolute plane deviation. a, b, c, d are the four points whose deviation from a common plane is measured.

Parameters:
a: np.ndarray,

Coordinate vector of the first point.

b: np.ndarray,

Coordinate vector of the second point.

c: np.ndarray,

Coordinate vector of the third point.

d: np.ndarray

Coordinate vector of the fourth point.

Returns:
float

absolute deviation from planarity of the points a, b, c, d
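
Going by the signature, which takes four points a, b, c, d, one reading of "plane deviation" is how far the dihedral angle a-b-c-d lies from a planar arrangement (0 or pi). The sketch below implements that reading with NumPy. The interpretation and the names dihedral_sketch and plane_deviation_sketch are assumptions, not the provis implementation.

```python
import numpy as np


def dihedral_sketch(a, b, c, d):
    """Signed dihedral angle (radians) around the b-c axis."""
    a, b, c, d = (np.asarray(p, dtype=float) for p in (a, b, c, d))
    b0 = a - b
    b1 = c - b
    b2 = d - c
    b1 = b1 / np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.arctan2(y, x))


def plane_deviation_sketch(a, b, c, d):
    """Distance of the dihedral from the nearest planar value (0 or pi)."""
    dih = abs(dihedral_sketch(a, b, c, d))
    return min(dih, np.pi - dih)


# Four coplanar points give zero deviation; a perpendicular fourth point gives pi/2.
flat = plane_deviation_sketch([1, 0, 0], [0, 0, 0], [0, 1, 0], [-1, 1, 0])
bent = plane_deviation_sketch([1, 0, 0], [0, 0, 0], [0, 1, 0], [0, 1, 1])
```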

provis.utils.charges_utils.compute_satisfied_CO_HN(atoms)

Compute the list of backbone C=O:H-N that are satisfied. These will be ignored.

Parameters:
atoms: BioPython atoms

list of atoms to be checked.

Returns:
set

set of C=O bonds

set

set of H-N bonds

provis.utils.charges_utils.is_acceptor_atom(atom_name: str, res: Bio.PDB.Residue.Residue) bool

Check if atom is acceptor atom.

Parameters:
atom_name: str

Name of the atom

res: Residue

The corresponding residue

Returns:
bool

True if conditions met

provis.utils.charges_utils.is_polar_hydrogen(atom_name: str, res_name: str) bool

Check if the atom in a given residue has polar hydrogens.

Parameters:
atom_name: str

Name of the atom

res_name: str

Residue name

provis.utils.charges_utils.normalize_electrostatics(in_elec: numpy.ndarray) numpy.ndarray

Normalizes charges on the surface by clipping them to upper and lower thresholds and rescaling all values to the range [-1, 1].

Parameters:
in_elec: np.ndarray

Input charges for all surface vertices

Returns:
np.ndarray

Normalized surface vertex charges
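
The clip-and-rescale step can be sketched as follows. The threshold values of -3 and 3 are assumptions chosen for illustration, since the thresholds are not stated above, and normalize_electrostatics_sketch is a hypothetical name.

```python
import numpy as np


def normalize_electrostatics_sketch(in_elec, lower=-3.0, upper=3.0):
    """Clip charges to [lower, upper], then map that interval linearly onto [-1, 1]."""
    elec = np.clip(np.asarray(in_elec, dtype=float), lower, upper)
    # Shift to [0, upper - lower], scale to [0, 2], then recentre on zero.
    return 2.0 * (elec - lower) / (upper - lower) - 1.0


charges = np.array([-10.0, -3.0, 0.0, 1.5, 7.0])
normalized = normalize_electrostatics_sketch(charges)
```

Values beyond the thresholds saturate at -1 or 1, so a few extreme vertices do not compress the color range of the rest of the surface.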

provis.utils.surface_feat module

https://github.com/bunnech/holoprot/blob/main/holoprot/feat/surface.py is the base for this file. Modifications were made.

Functions to compute features for a patch on the protein surface.

Some of these are borrowed from https://github.com/LPDI-EPFL/masif under the https://github.com/LPDI-EPFL/masif/blob/master/LICENSE license

provis.utils.surface_feat.assign_props_to_new_mesh(new_vertices, old_vertices: numpy.ndarray, old_props: numpy.ndarray, feature_interpolation: bool = True) numpy.ndarray

Assign properties to vertices in modified mesh given the initial mesh. The assignment is carried using a KDTree data structure to query nearest points.

Parameters:
new_vertices: np.ndarray

Vertices on the modified mesh

old_vertices: np.ndarray

Vertices on the original mesh

old_props: np.ndarray

Property values for each vertex on the original mesh

feature_interpolation: bool, (default True)

If set to True interpolates features to new vertices.

Returns:
np.ndarray

Property values for vertices on the modified mesh
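
A minimal sketch of the idea follows. To stay dependency-free it replaces the KDTree with a brute-force distance matrix, which gives the same neighbors but scales poorly with mesh size. The inverse-distance weighting, the parameter k, and the name assign_props_sketch are assumptions for illustration; the interpolation scheme used by provis is not described above.

```python
import numpy as np


def assign_props_sketch(new_vertices, old_vertices, old_props,
                        feature_interpolation=True, k=4):
    """Transfer one scalar property per vertex from an old mesh to a new one."""
    new_vertices = np.asarray(new_vertices, dtype=float)
    old_vertices = np.asarray(old_vertices, dtype=float)
    old_props = np.asarray(old_props, dtype=float)
    # Pairwise distances, shape (n_new, n_old). A KD-tree avoids this full matrix.
    dists = np.linalg.norm(new_vertices[:, None, :] - old_vertices[None, :, :], axis=-1)
    if not feature_interpolation:
        # Copy the property of the single nearest old vertex.
        return old_props[np.argmin(dists, axis=1)]
    k = min(k, len(old_vertices))
    idx = np.argsort(dists, axis=1)[:, :k]
    near = np.take_along_axis(dists, idx, axis=1)
    # Inverse-distance weights; the floor avoids division by zero on exact matches.
    weights = 1.0 / np.maximum(near, 1e-12)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights * old_props[idx]).sum(axis=1)


old_v = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
old_p = np.array([0.0, 10.0])
new_v = np.array([[0.1, 0.0, 0.0], [0.75, 0.0, 0.0]])
nearest = assign_props_sketch(new_v, old_v, old_p, feature_interpolation=False)
blended = assign_props_sketch(new_v, old_v, old_p, feature_interpolation=True, k=2)
```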

provis.utils.surface_feat.compute_charges(vertices: numpy.ndarray, pdb_id: str, path: str) numpy.ndarray

Computes electrostatics for the surface vertices. The function first calls the PDB2PQR executable to prepare the PDB file for electrostatics. Poisson-Boltzmann electrostatics are then computed using the APBS executable. Multivalue, provided with the APBS suite, is used to assign charges to each vertex. The charges are then normalized.

Parameters:
vertices: np.ndarray

Surface vertex coordinates

pdb_id: str

PDB ID of the protein

path: str

Path of the pqr file in the form {path}.pqr

Returns:
np.ndarray

Charge values for each vertex

provis.utils.surface_feat.compute_hbonds(vertices: numpy.ndarray, residues: List[Bio.PDB.Residue.Residue], names: List[str]) numpy.ndarray

Compute H-bond (hydrogen-bond) induced charges at every vertex.

Parameters:
vertices: np.ndarray

Vertices of mesh

residues: List[Residue]

List of residues to compute

names: List[str]

List of custom names created by output_pdb_as_xyzrn()

Returns:
np.ndarray

Array of bonds, by vertex

provis.utils.surface_feat.compute_hydrophobicity(names: List[str]) numpy.ndarray

Compute hydrophobicity value for all vertices on the surface. Each surface vertex has a mapping to the corresponding residue from the original protein. This is used to assign a hydrophobicity value to each vertex using the Kyte-Doolittle scale.

Parameters:
names: List[str]

Identifier names for each vertex in the surface

Returns:
np.ndarray

Hydrophobicity values for each surface vertex
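
The lookup can be sketched as a dictionary of Kyte-Doolittle values keyed by three-letter residue name. The identifier layout assumed here (underscore-separated fields with the residue name in the fourth position) and the fallback of 0.0 for unknown residues are assumptions for illustration. The sketch returns a plain list rather than an array.

```python
# Kyte-Doolittle hydropathy index by three-letter residue code.
KYTE_DOOLITTLE = {
    "ILE": 4.5, "VAL": 4.2, "LEU": 3.8, "PHE": 2.8, "CYS": 2.5,
    "MET": 1.9, "ALA": 1.8, "GLY": -0.4, "THR": -0.7, "SER": -0.8,
    "TRP": -0.9, "TYR": -1.3, "PRO": -1.6, "HIS": -3.2, "GLU": -3.5,
    "GLN": -3.5, "ASP": -3.5, "ASN": -3.5, "LYS": -3.9, "ARG": -4.5,
}


def hydrophobicity_sketch(names, residue_field=3, default=0.0):
    """Return one hydropathy value per vertex identifier.

    `residue_field` is the assumed position of the residue name within the
    underscore-separated identifier; `default` is used for unknown residues.
    """
    values = []
    for name in names:
        res_name = name.split("_")[residue_field]
        values.append(KYTE_DOOLITTLE.get(res_name, default))
    return values


# Hypothetical identifiers in the assumed layout.
example_names = ["A_15_x_ILE_CA_green", "A_16_x_ARG_NZ_blue", "A_17_x_UNK_C_green"]
hydro = hydrophobicity_sketch(example_names)
```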

provis.utils.surface_feat.compute_shape_index(mesh) numpy.ndarray

Computes shape index for the patches. Shape index characterizes the shape around a point on the surface, computed using the local curvature around each point. These values are derived using Trimesh’s available geometric processing functionality.

Parameters:
mesh: Trimesh

The mesh is constructed using information about vertices and faces.

Returns:
np.ndarray

Shape index for each vertex
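
For orientation, the shape index is commonly defined from the principal curvatures k1 >= k2 as (2/pi) * arctan((k1 + k2) / (k1 - k2)). The sketch below evaluates that definition from per-vertex mean and Gaussian curvature arrays, however they were obtained. The sign convention, the use of arctan2 to handle k1 == k2, and the name shape_index_sketch are assumptions; the sketch does not reproduce the Trimesh calls used by provis.

```python
import numpy as np


def shape_index_sketch(mean_curv, gauss_curv):
    """Shape index in [-1, 1] from mean (H) and Gaussian (K) curvature per vertex."""
    H = np.asarray(mean_curv, dtype=float)
    K = np.asarray(gauss_curv, dtype=float)
    # Principal curvatures are H +/- sqrt(H^2 - K); clamp tiny negatives from noise.
    root = np.sqrt(np.maximum(H ** 2 - K, 0.0))
    k1 = H + root
    k2 = H - root
    # arctan2 equals arctan of the ratio when k1 > k2 and stays finite when k1 == k2.
    return (2.0 / np.pi) * np.arctan2(k1 + k2, k1 - k2)


# Sphere-like, saddle-like and cylinder-like points under this sign convention.
si = shape_index_sketch([1.0, 0.0, 0.5], [1.0, -1.0, 0.0])
```

Under this convention, 1 marks a sphere-like cap, 0 a symmetric saddle and 0.5 a ridge. Flipping the normal direction flips the sign.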

provis.utils.surface_feat.compute_surface_features(surface: Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, List[str], Dict[str, str]], pdb_file: str, path: str, mesh=None, fix_mesh: bool = False, return_mesh: bool = False, pdb_id: Optional[str] = None) Tuple[numpy.ndarray]

Computes all patch features.

Parameters:
surface: Surface

Tuple of attributes characterizing the surface. These include vertices, faces, normals to each vertex, areas, and residue identifiers for vertices.

pdb_file: str

PDB File containing the atomic coordinates

path: str

Path to files without extensions. Usually data/tmp/{pdb_id}

mesh: trimesh.Trimesh

The mesh.

fix_mesh: bool, optional

Whether to fix the mesh by collapsing nodes and edges. Default: False.

return_mesh: bool, optional

Whether to return the mesh. Default: False.

pdb_id: str, optional

PDB id of the associated protein. Default: None

Returns:
np.ndarray

Shape index

np.ndarray

Hydrogen bond induced charges

np.ndarray

Hydrophobicity of each surface vertex

np.ndarray

Electrostatics of each surface vertex

provis.utils.surface_utils module

https://github.com/bunnech/holoprot/blob/main/holoprot/utils/surface.py is the base for this file. Modifications were made.

Utilities for preparing and computing features on molecular surfaces.

provis.utils.surface_utils.compute_normal(vertices: numpy.ndarray, faces: numpy.ndarray) numpy.ndarray

Compute normals for the vertices and faces

Parameters:
vertices: np.ndarray

Vertices of the mesh

faces: np.ndarray

Faces of the mesh

Returns:
np.ndarray:

Normals of the mesh
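
One common way to obtain per-vertex normals from a triangle mesh is to sum the normals of the faces that touch each vertex and normalize the result. The sketch below does this with area weighting, which is a design choice of the sketch and not necessarily what provis does. vertex_normals_sketch is a hypothetical name.

```python
import numpy as np


def vertex_normals_sketch(vertices, faces):
    """Unit normal per vertex, as the area-weighted sum of adjacent face normals."""
    vertices = np.asarray(vertices, dtype=float)
    faces = np.asarray(faces, dtype=int)
    # The cross product of two triangle edges is a face normal whose length
    # is twice the triangle area, which provides the area weighting.
    e1 = vertices[faces[:, 1]] - vertices[faces[:, 0]]
    e2 = vertices[faces[:, 2]] - vertices[faces[:, 0]]
    face_normals = np.cross(e1, e2)
    normals = np.zeros_like(vertices)
    # np.add.at accumulates correctly when a vertex index appears in several faces.
    for corner in range(3):
        np.add.at(normals, faces[:, corner], face_normals)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.maximum(lengths, 1e-12)


# A unit square in the z = 0 plane, split into two counter-clockwise triangles.
square_vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
square_faces = np.array([[0, 1, 2], [0, 2, 3]])
square_normals = vertex_normals_sketch(square_vertices, square_faces)
```

With counter-clockwise winding, every normal of this flat square points along +z.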

provis.utils.surface_utils.crossp(x: numpy.ndarray, y: numpy.ndarray) numpy.ndarray

Computes the cross product of two NumPy arrays.

Parameters:
x: np.ndarray

Array 1

y: np.ndarray

Array 2

Returns:
np.ndarray:

(Array 1) x (Array 2)

provis.utils.surface_utils.find_nearest_atom(coords, res_id, new_verts)

provis.utils.surface_utils.fix_trimesh(mesh, resolution: float = 1.0)

Applies a predefined set of fixes to the mesh and converts it to a specified resolution. These fixes include removing duplicated vertices within a certain threshold, removing degenerate triangles, splitting longer edges to a given target length, and collapsing shorter edges.

Parameters:
mesh: trimesh.Trimesh

Mesh

resolution: float

Maximum size of edge in the mesh

Returns:
trimesh.Trimesh:

mesh with all fixes applied

provis.utils.surface_utils.get_surface(out_path: str, density: float, center=[0, 0, 0])

Wrapper function that reads in the output from the MSMS executable to build the protein surface.

Parameters:
out_path: str

Path to the output directory (the output path from namechecker). Usually data/tmp

density: float

Need to pass same density as used by the MSMS binary, as the face and vert files have the density included in their names. The variable is needed for loading these files.

center: List[float], optional

Center of the atom cloud. Easily passed from DataHandler._centroid. Default: [0, 0, 0].

Returns:
numpy.ndarray:

vertices

numpy.ndarray:

faces

numpy.ndarray:

vertex normals

list:

list of res_id’s from output_pdb_as_xyzrn()

dict:

dictionary: residues as keys, areas as values

provis.utils.surface_utils.output_pdb_as_xyzrn(pdb_file: str, xyzrn_file: str) None

Converts a .pdb file to a .xyzrn file.

Parameters:
pdb_file: str

path to PDB File to convert (with extension)

xyzrn_file: str

path to the xyzrn File (with extension)

provis.utils.surface_utils.prepare_trimesh(vertices: numpy.ndarray, faces: numpy.ndarray, normals: Optional[numpy.ndarray] = None, apply_fixes: bool = False)

Prepare the mesh surface given vertices and faces. Optionally, compute normals and apply fixes to mesh.

Parameters:
vertices: np.ndarray

Surface vertices

faces: np.ndarray

Triangular faces on the mesh

normals: np.ndarray

Normals for each vertex

apply_fixes: bool

Optional application of fixes to the mesh. See fix_trimesh for details on the fixes. Default: False.

Returns:
trimesh.Trimesh:

Mesh

provis.utils.surface_utils.read_msms(file_root: str) Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, List[str]]

Read surface constituents from output files generated using MSMS.

Parameters:
file_root: str

Root name for loading .face and .vert files (produced by MSMS). Default location is data/tmp/{pdb_id}s

Returns:
numpy.ndarray:

vertices

numpy.ndarray:

faces

numpy.ndarray:

vertex normals

list:

list of res_id’s from output_pdb_as_xyzrn()

Module contents

provis.utils.get_residues(pdb_file)

provis.utils.str2bool(v: str) bool

Converts str to bool.

Parameters:
v: str

String element

Returns:
bool

Boolean version of v
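
A helper of this kind is often written as below, for example to parse command-line flags. The accepted spellings and the choice to raise ValueError on anything else are assumptions of this sketch; the strings that provis accepts are not listed above. str2bool_sketch is a hypothetical name.

```python
def str2bool_sketch(v):
    """Interpret common textual spellings of true/false, case-insensitively."""
    if isinstance(v, bool):
        # Already a boolean: pass it through unchanged.
        return v
    value = v.strip().lower()
    if value in ("yes", "true", "t", "y", "1"):
        return True
    if value in ("no", "false", "f", "n", "0"):
        return False
    raise ValueError(f"Cannot interpret {v!r} as a boolean")


flag_on = str2bool_sketch("True")
flag_off = str2bool_sketch("no")
```

Raising on unrecognized input, rather than defaulting to False, surfaces typos early.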