Server For Computing/Predicting DEPTH, Cavity Sizes, Ligand Binding Sites and pKa

DEPTH Algorithm

Depth is the distance from an atom/residue to its closest molecule of bulk solvent.

Step 1: Solvating the Protein Molecule

The protein molecule of interest is placed at the center of a pre-equilibrated box of solvent (water). The full-atom SPC216 water model is used here, generated using GROMACS [Hess et al., 2008] genbox with the spc216.gro structure file.
The water molecules that clash with atoms of the protein (within 2.6Å of protein atoms) are removed from the box.
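
The clash filter can be sketched in a few lines of Python. This is only an illustration, not the server's code: the function name and the representation of atoms as (x, y, z) tuples are assumptions, and treating "within 2.6Å" as a strict less-than comparison is also an assumption.

```python
import math

CLASH_CUTOFF = 2.6  # Å, cutoff quoted in the text

def remove_clashing_waters(waters, protein_atoms, cutoff=CLASH_CUTOFF):
    """Keep only waters that are at least `cutoff` Å from every protein atom.

    `waters` and `protein_atoms` are lists of (x, y, z) tuples (hypothetical
    input format chosen for this sketch).
    """
    return [
        w for w in waters
        if all(math.dist(w, a) >= cutoff for a in protein_atoms)
    ]
```

The all-pairs loop is quadratic; for a real solvent box a spatial grid or k-d tree would be the usual choice.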

Step 2: Removing Non-Bulk Waters

Other than the clashing water molecules, non-bulk waters are also removed from the box.

Non-bulk waters are those that are trapped in cavities (Figure 1) and isolated from the bulk solvent.

Isolated waters are detected by counting the water molecules in the immediate neighborhood of each water molecule.

A water molecule is considered non-bulk if there are fewer than a specified minimum number of neighborhood waters (default value = 4) within a spherical volume of a specified solvent neighborhood radius (default value = 4.2Å, Figure 2).
The removal of a cavity water causes each of its immediate neighbors to lose one neighborhood water molecule. For this reason, the check and removal of non-bulk waters is iterated until no further water is removed from the solvent box. For practical reasons, users are advised to vary the minimum number of neighborhood waters in the range 1 - 5. Requiring a larger number of neighborhood waters often results in the removal of all water molecules.
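
The iterative check described above can be sketched as follows. This is a minimal illustration under stated assumptions: waters are (x, y, z) tuples, the function name is hypothetical, and each pass counts neighbors against the waters present at the start of that pass (the text does not say whether removals take effect immediately or per pass).

```python
import math

def remove_nonbulk_waters(waters, min_neighbors=4, radius=4.2):
    """Repeatedly drop waters with fewer than `min_neighbors` other waters
    within `radius` Å, until a pass removes nothing.

    Defaults follow the values quoted in the text.
    """
    waters = list(waters)
    while True:
        kept = []
        for i, w in enumerate(waters):
            # Count the other waters inside the neighborhood sphere.
            neighbors = sum(
                1 for j, v in enumerate(waters)
                if j != i and math.dist(w, v) <= radius
            )
            if neighbors >= min_neighbors:
                kept.append(w)
        if len(kept) == len(waters):  # nothing removed: converged
            return kept
        waters = kept
```

A short chain of waters shows why iteration is needed: with `min_neighbors=2` the two end waters go first, which leaves the inner waters with one neighbor each, so they are removed in the next pass.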


Step 3: Sampling Solvent Configurations

The bulk water surrounding a protein is freely diffusing. To mimic these dynamics, the protein is solvated repeatedly, each time in a different orientation. New orientations are generated by rotating the protein by a random angle about an axis passing through its center of mass, and translating it along the X-axis by a random distance < 2.8 Å (the average distance between neighboring water molecules in the box). Each solvation of the protein is considered to represent a snapshot of the dynamics of bulk water. With a sufficient number of solvations, water molecules can explore all regions accessible to bulk solvent, hence mimicking bulk-water dynamics (Figure 3).
At each solvation iteration, the depth of an atom/residue is computed as the distance from the atom/residue to the closest molecule of bulk water. Depth is finally reported as the average depth over all solvation iterations. The user can specify the 'number of solvation cycles' (default = 25).
Note: Run time scales linearly with number of solvation cycles.
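
The sampling loop can be sketched as below. Several details are assumptions made for illustration only: the center of mass is approximated by the unweighted mean of the atom coordinates, the rotation axis is drawn uniformly at random, and `solvate` is a hypothetical caller-supplied function that returns the bulk-water coordinates for a posed protein (that is, it stands in for Steps 1 and 2).

```python
import math
import random

MAX_SHIFT = 2.8  # Å, upper bound on the X translation quoted in the text

def random_pose(atoms, rng):
    """Rotate `atoms` by a random angle about a random axis through their
    centroid (Rodrigues' formula), then shift them along X."""
    n = len(atoms)
    c = [sum(a[i] for a in atoms) / n for i in range(3)]
    k = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in k))
    k = [x / norm for x in k]                      # unit rotation axis
    theta = rng.uniform(0.0, 2.0 * math.pi)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    shift = rng.random() * MAX_SHIFT
    posed = []
    for a in atoms:
        v = [a[i] - c[i] for i in range(3)]
        kxv = [k[1] * v[2] - k[2] * v[1],
               k[2] * v[0] - k[0] * v[2],
               k[0] * v[1] - k[1] * v[0]]
        kdv = sum(k[i] * v[i] for i in range(3))
        r = [v[i] * cos_t + kxv[i] * sin_t + k[i] * kdv * (1.0 - cos_t)
             for i in range(3)]
        posed.append((r[0] + c[0] + shift, r[1] + c[1], r[2] + c[2]))
    return posed

def mean_depths(atoms, solvate, cycles=25, seed=0):
    """Average, over `cycles` solvations, each atom's distance to its
    nearest bulk water."""
    rng = random.Random(seed)
    totals = [0.0] * len(atoms)
    for _ in range(cycles):
        posed = random_pose(atoms, rng)
        bulk = solvate(posed)
        for i, p in enumerate(posed):
            totals[i] += min(math.dist(p, w) for w in bulk)
    return [t / cycles for t in totals]
```

Each cycle performs one full solvation, which is why the run time grows linearly with the number of cycles.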


Binding Cavity Prediction

Step 1: Constructing Binding Cavity Probability Tables for Amino-Acids

A binding cavity is a protein sub-structure with conserved geometrical and chemical properties complementary to its bound ligand. Using a training set of ligand-bound, high-resolution crystal structures of proteins, residue depth and solvent-accessible area values were computed for all residues. The probability that an individual amino acid forms part of the binding cavity is parametrized by its (residue depth, accessible area) value pair (Figure 4).

Step 2: Assigning Probability Values onto Protein of Interest

For a query protein, solvent accessibility and depth are computed for all residues. Each residue is assigned the binding cavity probability value corresponding to its (solvent accessibility, residue depth) value pair (Figure 5). If evolutionary information is used in making the predictions, 3 iterations of PSI-BLAST are used to create a multiple sequence alignment of homologues of the query (e-value cut-off of 0.00001). From this multiple sequence alignment an entropy value is computed for each residue, and the final probability value of a residue is the average of its Depth/ASA probability value and its entropy probability value.
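
The two ingredients of the evolutionary variant can be sketched as follows. The text does not say which entropy measure is used or how an entropy value is converted into a probability, so this sketch assumes Shannon entropy (in bits) over the residues of an alignment column with gaps ignored, and leaves the entropy-to-probability mapping out entirely. Both function names are hypothetical.

```python
import math
from collections import Counter

def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gaps.

    Assumption: the text only says "an entropy value"; Shannon entropy is
    used here for illustration.
    """
    residues = [ch for ch in column if ch != gap]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = 0.0
    for count in Counter(residues).values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy

def combined_probability(p_depth_asa, p_entropy):
    """Average the Depth/ASA probability and the entropy-based probability,
    as described in the text."""
    return (p_depth_asa + p_entropy) / 2.0
```

A fully conserved column has entropy 0, and a column split evenly between two residue types has entropy 1 bit.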


Step 3: Grouping Cavity Residues

  1. All residues with probability values above a user definable cavity prediction probability threshold are selected as binding cavity residues.
  2. A 6.2Å sphere is built around each of these selected residues.
  3. Starting from the residue with highest probability value, all other selected residues within this sphere are merged into the same binding cavity.
  4. The process is repeated until no further merger occurs.
  5. Finally, a 3.6Å sphere is built around every residue within each cavity.
  6. All solvent-accessible residues (side-chain accessibility > 30%) that fall within these spheres are also grouped into the binding cavities.
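
The six steps above can be sketched as a seeded clustering followed by an expansion pass. This is an illustration under assumptions, not the server's implementation: each residue is reduced to a single representative point (the text does not say which point or distance is used), and the input format and function name are hypothetical.

```python
import math

def group_cavities(residues, threshold, merge_radius=6.2,
                   grow_radius=3.6, min_access=30.0):
    """Group residues into binding cavities.

    `residues` maps a residue name to a tuple
    (xyz, probability, side_chain_accessibility_percent).
    Returns a list of cavities, each a list of residue names.
    """
    # Step 1: select residues above the threshold, highest probability first.
    unassigned = sorted(
        (r for r in residues if residues[r][1] > threshold),
        key=lambda r: residues[r][1], reverse=True,
    )
    cavities = []
    while unassigned:
        # Steps 2-4: seed with the best remaining residue, then keep merging
        # selected residues within merge_radius until nothing more is added.
        cavity = [unassigned.pop(0)]
        grew = True
        while grew:
            grew = False
            for r in list(unassigned):
                if any(math.dist(residues[r][0], residues[m][0]) <= merge_radius
                       for m in cavity):
                    cavity.append(r)
                    unassigned.remove(r)
                    grew = True
        cavities.append(cavity)
    # Steps 5-6: add solvent-accessible residues within grow_radius of the
    # residues already in each cavity.
    for cavity in cavities:
        core = list(cavity)
        for r, (xyz, _prob, access) in residues.items():
            if r in cavity or access <= min_access:
                continue
            if any(math.dist(xyz, residues[m][0]) <= grow_radius for m in core):
                cavity.append(r)
    return cavities
```

Because merging continues until no residue is added, two selected residues more than 6.2Å apart can still end up in the same cavity when a chain of intermediate selected residues links them.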