Structure Overlap
Given two proteins A and B, structure overlap (also called equivalent positions) is the percentage of the representative atoms in protein A that lie within 3.5 Å of the corresponding atoms in the superimposed protein B.
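The definition above can be sketched in a few lines of Python. This is only an illustration, not CLICK's own code: the function name `structure_overlap`, its arguments, and the choice of an inclusive cut-off (`<=`) are assumptions. The coordinates are taken to be already superimposed and paired by correspondence.

```python
import math

def structure_overlap(coords_a, coords_b, n_atoms_a, cutoff=3.5):
    """Percentage of A's representative atoms within `cutoff` angstroms
    of their corresponding atom in the superimposed protein B.

    coords_a, coords_b: equally long sequences of (x, y, z) tuples,
    paired by index (hypothetical input layout).
    n_atoms_a: total number of representative atoms in protein A.
    """
    if n_atoms_a <= 0:
        raise ValueError("protein A must have at least one representative atom")
    if len(coords_a) != len(coords_b):
        raise ValueError("corresponding coordinate lists must have equal length")
    # Count the pairs whose separation is within the cut-off (assumed inclusive).
    close = sum(1 for a, b in zip(coords_a, coords_b) if math.dist(a, b) <= cutoff)
    return 100.0 * close / n_atoms_a
```

For example, if protein A has 4 representative atoms and 2 of its 3 paired atoms fall within 3.5 Å of their partners, the function returns 50.0.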
Match Size
Match size (also called absolute similarity) is the number of atoms in the list of equivalences.
Root Mean Square Deviation
Root mean square deviation (RMSD) is computed from the norm of the distance vector between the two sets of coordinates of representative atoms, after superimposition. It is given by

$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\lVert x_i^{A} - x_i^{B}\right\rVert^{2}}$$

where N is the match size, and x_i^A and x_i^B are the Cartesian coordinates of the representative atoms of structurally equivalent amino acid residues of proteins A and B.
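As a minimal sketch of the RMSD calculation, assuming the two coordinate lists are already superimposed and paired by index (the function name `rmsd` and the input layout are illustrative choices, not part of CLICK):

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation over N matched representative atoms.

    coords_a, coords_b: equally long, non-empty sequences of (x, y, z)
    tuples; entry i of each list is one structurally equivalent pair.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    n = len(coords_a)  # N, the match size
    # Sum of squared Euclidean distances between equivalent atoms.
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / n)
```

Two matched pairs separated by 3 Å and 4 Å give sqrt((9 + 16) / 2), roughly 3.54 Å.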
Z-score
For each alignment produced by CLICK, a Z-score is computed to determine the significance of the alignment. In our case, the Z-score is an estimate of the likelihood that the alignment is different from an alignment made by chance. It is computed as

$$Z = \frac{SO_{A\text{-}S_i} - \mathrm{avg\_SO_{bg}}}{\mathrm{std\_SO_{bg}}}$$

where
  • A is the query structure for which similar structures are sought in a database, whose members are the structures {Si}.

  • SO_{A-Si} is the average structure overlap (computed over the representative atoms) on the superimposition of A and Si within cut-off distances of 1 Å, 2 Å, and 3 Å.
  • avg_SObg and std_SObg are the average and standard deviation of the structure overlap within 1 Å, 2 Å, and 3 Å for alignments produced by aligning all members of a background database with one another. The background database is chosen according to the submitted query.
  • For instance, for a query protein structure, the background database is a non-redundant set of protein structures consisting of 1601 chains. Similarly, the DNA, RNA, and DNA-protein complex databases consist of 107, 255, and 301 non-redundant structures, respectively.
  • When the query structure is a fragmented chain of a structure that consists of N representative points, the same number of
  • The avg_SObg is then computed by an all-against-all alignment of these 30 selected extracts. For the comparisons of whole molecules or for fragments containing more than 80 representative points, the background scores are computed by considering the all-against-all comparisons of structures of 30 randomly chosen whole molecules.

  • For a significant match, the Z-score should be above 2.5. The greater the Z-score, the more significant the match.
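The Z-score described above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: structure overlap is expressed as a percentage, the average over the three cut-offs is a plain arithmetic mean, the cut-offs are inclusive, and the population standard deviation is used. All function names are hypothetical.

```python
from statistics import mean, pstdev

CUTOFFS = (1.0, 2.0, 3.0)  # cut-off distances in angstroms, from the text

def average_overlap(distances, n_atoms_a, cutoffs=CUTOFFS):
    """Mean structure overlap (percent) over the given cut-offs.

    distances: distance of each matched representative atom of A to its
    partner in the superimposed structure (hypothetical input layout).
    n_atoms_a: total number of representative atoms in A.
    """
    per_cutoff = [
        100.0 * sum(1 for d in distances if d <= c) / n_atoms_a
        for c in cutoffs
    ]
    return mean(per_cutoff)

def z_score(so_query, background_overlaps):
    """Z = (SO_{A-Si} - avg_SObg) / std_SObg.

    background_overlaps: average overlaps from the all-against-all
    alignment of the background set.
    """
    avg_bg = mean(background_overlaps)
    std_bg = pstdev(background_overlaps)  # assumption: population std
    if std_bg == 0:
        raise ValueError("background overlaps have zero spread")
    return (so_query - avg_bg) / std_bg

def is_significant(z, threshold=2.5):
    """A match is considered significant when Z is above 2.5."""
    return z > threshold
```

With a query overlap of 50 and background overlaps of 10 and 30 (mean 20, standard deviation 10), the Z-score is 3.0, which is above the 2.5 threshold.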
Fragment Score
When heuristic measures are applied to maintain chain or fragment continuity, some residue matches are eliminated from consideration because they do not belong to (or lie in close proximity to) contiguously matched fragments. The fragment score is the ratio of the number of matched positions in the alignment after the heuristics are applied to the number before. This is a handy measure of the extent of similarity between two protein structures, especially when they are of dissimilar fold. For structures of similar fold (and size), the fragment score is close to 1 (the maximum value).
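A minimal sketch of the fragment score follows. It assumes the ratio is taken as matches after the heuristics divided by matches before, which is the only reading consistent with a maximum value of 1. The function and parameter names are illustrative.

```python
def fragment_score(matches_before, matches_after):
    """Fraction of matched positions that survive the continuity heuristics.

    matches_before: matched positions before the heuristics are applied.
    matches_after: matched positions remaining afterwards.
    """
    if matches_before <= 0:
        raise ValueError("there must be at least one match before filtering")
    if not 0 <= matches_after <= matches_before:
        raise ValueError("the heuristics can only remove matches")
    return matches_after / matches_before
```

If 120 positions are matched initially and 90 remain after the heuristics, the score is 0.75. When nothing is eliminated, it is 1.0.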
Topology Score
The topology score is a measure of how similar the topologies of the matched structures are to one another. It is computed based on the directionality of the matched sequence fragments. The topology score varies between a maximum of 1 for topologically identical structures and 0 for those that are topologically completely dissimilar.

  • In each of the 4 examples, the two proteins that are matched consist of 3 different sequence segments.

  • The direction of the arrow that symbolizes each segment shows the direction from the N to the C terminus.

  • Unless explicitly indicated with black arrows, each segment in the top structure is aligned to the one directly below it.

  • Four different cases of sequence alignments implied by the CLICK structural alignment are illustrated here. a) The sequence alignment maintains topology; topology score = 1. b) The N-to-C directionality of the top sequence is the exact opposite of that of the bottom sequence; topology score = 0. c) Two of the three sequence segments have the same directionality; topology score = 0.66. d) Two of the three segments are matched, but not in sequential order; topology score = 0.66.