Tertris: Automatic and manual 3d protein alignment


Tertris = tertiary + Tetris

Tertris is an experimental prototype of a small, simple, fast, Macintosh Universal program for both automatic and interactive 3d alignment of protein tertiary structures. Tertris requires a G4 or G5 Macintosh running OS X 10.3.9 or 10.4.8+, or an Intel Macintosh running OS X 10.4.8+. Tertris is free for all non-commercial users.

Download: Tertris program and example data files



New work:



Tertris reads two PDB text files in Protein Data Bank version 2.3 format. See the PDB site for more information about this format, and to download additional data files. Correspondences are formed between the backbone Carbon-alpha atoms of each structure. Correspondences can be formed either by sequential atom number (i.e. residue number), or from an external 1d alignment file, which can be prepared with the Dali server, a Needleman-Wunsch dynamic programming alignment, or any other means.
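As a rough illustration of the input step, here is a minimal Python sketch that pulls the Carbon-alpha coordinates of one chain out of PDB text, using the fixed column positions of the ATOM record. It is not the Tertris source. The function names, the choice of the first chain found, and the handling of alternate locations are all assumptions made for the example, and insertion codes are ignored.

```python
def pdb_atom_line(serial, name, res_name, chain, res_num, x, y, z):
    """Build one fixed-column ATOM record (hypothetical helper, demo data only)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_num, x, y, z)


def read_ca_atoms(pdb_text, chain_id=None):
    """Return {residue_number: (x, y, z)} for the CA atoms of one chain.

    If chain_id is None, the first chain that contains a CA atom is used
    (an assumption; the text only says that one chain per file is used).
    """
    atoms = {}
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break                              # first model only
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA":        # columns 13-16: atom name
            continue
        if line[16] not in (" ", "A"):         # column 17: alternate location
            continue
        chain = line[21]                       # column 22: chain identifier
        if chain_id is None:
            chain_id = chain
        if chain != chain_id:
            continue
        res_num = int(line[22:26])             # columns 23-26: residue number
        xyz = (float(line[30:38]),             # columns 31-38: x in Angstroms
               float(line[38:46]),             # columns 39-46: y
               float(line[46:54]))             # columns 47-54: z
        atoms.setdefault(res_num, xyz)         # keep the first CA per residue
    return atoms


# Made-up demo records, not taken from any real PDB entry.
DEMO_PDB = "\n".join([
    pdb_atom_line(1, " N  ", "SER", "A", 94, 10.0, 20.0, 30.0),
    pdb_atom_line(2, " CA ", "SER", "A", 94, 11.5, 20.5, 30.25),
    pdb_atom_line(3, " CA ", "SER", "A", 95, 14.0, 22.0, 31.0),
    pdb_atom_line(4, " CA ", "GLY", "B", 94, 50.0, 50.0, 50.0),
])
print(read_ca_atoms(DEMO_PDB))
```

Keying the result by residue number makes the "correspondence by residue number" case a simple lookup of the numbers that both structures share.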

Shown below are two similar but variant structures from human p53 protein:

Both objects are displayed superimposed in different colors (yellow = obj1, green = obj2). Each object is displayed about its center (average) of coordinates, although the objects can be translated with respect to one another. Each coordinate is the orthogonal location in Angstroms of a Carbon-alpha atom in the chain. Just one chain from each file is used at present. The display can be rotated in 3d (by dragging) and zoomed (by dragging with the shift key down) with the mouse. The goal is to superimpose object 2 onto object 1 by using {x, y, z} translation, and {rx, ry, rz} rotation about each of the three axes in the order rx -> ry -> rz. Rotation is performed only on object 2, about its center.
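The six-parameter transform described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: angles are taken to be in degrees (as the console output further down suggests), rotations are right-handed, and the translation is applied after the rotation. Tertris may use different sign or ordering conventions, and the function name is made up for the example.

```python
import math


def transform_object2(points, rx=0.0, ry=0.0, rz=0.0, tx=0.0, ty=0.0, tz=0.0):
    """Rotate points about their own center in the order rx -> ry -> rz,
    then translate by (tx, ty, tz). Angles are in degrees (an assumption)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n      # center = average of coordinates
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    ax, ay, az = (math.radians(a) for a in (rx, ry, rz))
    moved = []
    for x, y, z in points:
        x, y, z = x - cx, y - cy, z - cz    # move the center to the origin
        # rotation about the x axis
        y, z = (y * math.cos(ax) - z * math.sin(ax),
                y * math.sin(ax) + z * math.cos(ax))
        # rotation about the y axis
        x, z = (x * math.cos(ay) + z * math.sin(ay),
                -x * math.sin(ay) + z * math.cos(ay))
        # rotation about the z axis
        x, y = (x * math.cos(az) - y * math.sin(az),
                x * math.sin(az) + y * math.cos(az))
        moved.append((x + cx + tx, y + cy + ty, z + cz + tz))
    return moved


# Two made-up points centered on the origin, turned a quarter turn about z.
print(transform_object2([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], rz=90.0))
```

Because the rotation is taken about the object's own center, turning object 2 does not drag it away from object 1; only the translation parameters move the center.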

The error function used is the simple root-mean-squared-distance (RMSD) of each pair of corresponding atoms:

error = sqrt( sum( (x2(i) - x1(i))^2 + (y2(i) - y1(i))^2 + (z2(i) - z1(i))^2 ) / ni)
where {x1(i), y1(i), z1(i)} and {x2(i), y2(i), z2(i)} are the coordinates of corresponding atoms in each structure, that is, atoms attached to amino acid residues occupying the same sequence number, and ni is the number of corresponding pairs. Correspondences can also be formed by reading a 1d alignment file (see below), which specifies the pairs of residues that correspond in each protein. The idea is to make the error as small as possible by adjusting the parameters {x, y, z, rx, ry, rz}. Realignment can be done manually with the sliders, or automatically using either an iterative Monte Carlo optimization algorithm, or an iterative conjugate axes rotation algorithm.
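For reference, the error function above translates almost directly into Python. This is a small sketch rather than the program's own code; the function name and the list-of-tuples layout are assumptions.

```python
import math


def rmsd(coords1, coords2):
    """Root-mean-squared distance over corresponding atom pairs.

    coords1[i] and coords2[i] are the (x, y, z) positions, in Angstroms,
    of the i-th pair of corresponding Carbon-alpha atoms.
    """
    if len(coords1) != len(coords2) or not coords1:
        raise ValueError("need the same non-zero number of atoms in both lists")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords1, coords2):
        total += (x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2
    return math.sqrt(total / len(coords1))


# Made-up coordinates: one pair 5 Angstroms apart, one pair coincident.
print(rmsd([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
           [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]))
```

Only atoms that have a partner in the other structure enter the sum, so the value depends on which correspondences are in use as well as on the six alignment parameters.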

When the 6 alignment parameters are correct, the error should fall to its minimum value. For identical structures, the error should fall asymptotically to zero. The alignment, parameters, and error are continuously updated in the plot. In addition, a bar meter in the plot shows the relative amount of error. The 5 red bars at left indicate the error in decreasing powers of 10. It is therefore easy to see how the error changes as you adjust the sliders, and where it reaches a minimum for the current parameter and begins rising again.

To try the program, load the two files 2BIN.pdb and 2BIO.pdb either by typing the full pathnames into the file fields and pressing enter after each, or by clicking the file buttons to browse from a dialog. You should load the files in the order object 1 -> object 2. Whenever you reload object 1, object 2 is deleted and must be reloaded, as correspondences between the two must be recalculated.

The plot shows the number of CA atoms in each structure at upper left, with the number of corresponding atoms in parentheses. In order to perform an automated alignment, there must be at least 3 pairs of corresponding atoms between the two structures. The 1d alignment window shows a dot plot of the current correspondences between the two proteins:

In this case it is almost a straight line, as the residue sequence numbers are being used. However, there are some residues that are missing from one or the other protein, so the correspondence is not exact.
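To make the idea concrete, the following Python sketch pairs atoms by shared residue number and also produces the index pairs that a dot plot like the one above could draw. The function name, the dictionary layout, and the choice of plot axes (position of each atom within its own chain) are assumptions for illustration, not a description of the Tertris code.

```python
def correspond_by_residue_number(ca1, ca2):
    """Pair CA atoms that share a residue number.

    ca1 and ca2 map residue number -> (x, y, z). Returns three lists:
    the shared residue numbers, the matching coordinate pairs, and
    (index in chain 1, index in chain 2) dots for a 1d alignment plot.
    """
    order1 = sorted(ca1)
    order2 = sorted(ca2)
    index1 = {res: i for i, res in enumerate(order1)}
    index2 = {res: i for i, res in enumerate(order2)}
    shared = [res for res in order1 if res in index2]
    pairs = [(ca1[res], ca2[res]) for res in shared]
    dots = [(index1[res], index2[res]) for res in shared]
    return shared, pairs, dots


# Made-up chains: residue 95 is missing from chain 2, residue 98 from chain 1.
chain1 = {94: (0.0, 0.0, 0.0), 95: (3.8, 0.0, 0.0),
          96: (7.6, 0.0, 0.0), 97: (11.4, 0.0, 0.0)}
chain2 = {94: (0.1, 0.0, 0.0), 96: (7.5, 0.2, 0.0),
          97: (11.3, 0.1, 0.0), 98: (15.2, 0.0, 0.0)}
shared, pairs, dots = correspond_by_residue_number(chain1, chain2)
print(shared)            # residue numbers present in both chains
print(dots)              # a missing residue shifts the line sideways
print(len(pairs) >= 3)   # enough pairs for an automated alignment
```

A residue missing from one chain shifts every later dot by one position, which is why the plot is close to, but not exactly, a straight diagonal.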

For 2BIN and 2BIO, the initial alignment is pretty good (1.71 Angstroms), since these are homologous structures. To demonstrate how the automated alignment works, first drastically misalign the structures by moving the x, y, z, rx, ry, and rz sliders similar to the following figure. You can also type values into the fields above the sliders, and press enter.

You can watch the alignment error increase while you adjust the sliders. Note that moving the rotation sliders to their extremes does not necessarily result in the greatest error. Play with the sliders and try to find the orientation with the greatest error. For this grossly misaligned orientation, the error has risen to 21.12 Angstroms.

Once you have misaligned object 2, click the run button. At the default settings, this will perform 1000 iterations of the automatic alignment, redisplaying the two structures every 100 iterations. If the error happens to fall to 0.001, the alignment will be stopped before 1000 iterations. However, the error does not fall to this value, since these are not identical structures. The following figure shows the final re-alignment. The full run takes about 1.5 seconds on my 2 GHz MacBook:

Output is also produced on the system console:

Initial x = 4.75524, y = -3.1578, z = 6.36363, rx = 54.1258, ry = -53.052, rz = 125.874, err = 21.1198
Alignment x = -0.151025, y = -0.423113, z = -0.214116, rx = 1.028, ry = 0.420813, rz = 3.89804, err = 1.36062
Alignment took 1.54504 seconds

The final alignment is pretty good (1.36 Angstroms, better than the original). To convince yourself of that, click the run button again to run another 1000 iterations. You don't have to run all the iterations in one big block. You can set the iterations field to a smaller number, and click the run button to run only a few iterations at a time. As you run more iterations, the error will converge toward its smallest value. At any time, you can move the sliders to reorient object 2, and then begin the automatic alignment again from that point. Clicking the reset button clears the iterations, and resets all parameters to 0.
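The run loop described above (a fixed number of iterations, a redisplay every 100, and an early stop once the error reaches 0.001) can be sketched as follows. The details of the Monte Carlo method used by Tertris are not given here, so this example swaps in one simple variant, a greedy random perturbation search that keeps a trial only when it lowers the error. The function name, the Gaussian step size, the seed, and the stand-in error function are all assumptions.

```python
import random


def monte_carlo_minimize(error_fn, start, iterations=1000, tolerance=0.001,
                         step=1.0, report_every=100, on_report=None, seed=0):
    """Greedy random search over a parameter list such as [x, y, z, rx, ry, rz].

    error_fn(params) returns the error to minimize; in an alignment it would
    be the RMSD after applying params to object 2.
    Returns (best_params, best_error, iterations_done).
    """
    rng = random.Random(seed)
    best = list(start)
    best_err = error_fn(best)
    done = 0
    while done < iterations and best_err > tolerance:
        # perturb every parameter a little and keep the trial only if it helps
        trial = [p + rng.gauss(0.0, step) for p in best]
        err = error_fn(trial)
        if err < best_err:
            best, best_err = trial, err
        done += 1
        if on_report is not None and done % report_every == 0:
            on_report(done, best, best_err)   # e.g. redraw both structures
    return best, best_err, done


# Stand-in error: distance of the six parameters from the origin.
def toy_error(params):
    return sum(v * v for v in params) ** 0.5


start = [4.8, -3.2, 6.4, 54.0, -53.0, 126.0]
best, err, done = monte_carlo_minimize(
    toy_error, start,
    on_report=lambda i, p, e: print("iteration", i, "error", round(e, 4)))
print("stopped after", done, "iterations")
```

Because the function returns the best parameters found so far, calling it again with that result as the new start continues the search, much like clicking the run button a second time.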

Once you have aligned the two structures, note some things about their composition:

To further demonstrate the automatic alignment, load 2BIN.pdb into the file 2 position, so that 2BIN is used for both objects. Initially, the structures superimpose exactly. As before, adjust the sliders to drastically misalign the structures. Click the run button and watch the display. Quite quickly, object 2 will be realigned with object 1, and the error will drop asymptotically to zero. In this case the error drops below 0.001 after just over 500 iterations, which takes less than a second.

To demonstrate the use of an alignment file, and the conjugate axis rotation method, load 1LYZ.pdb and 2LZM.pdb. Then click the 'Correspondence using: Alignment file...' pop-up, and choose the file 'Dali.1lyz.2lzm.txt', which is an optimal alignment produced by the Dali web server. Then click the 'Align using: Conjugate gradient rotation' pop-up. You should now see the following:

The 1d alignment window will show the alignment based on the contents of the external file:

The conjugate axis alignment method has some advantages and some disadvantages compared with the Monte Carlo method:

  1. It is much faster. It can perform an alignment usually in a fraction of a second, with only a dozen or so iterations.
  2. It only performs a rotational alignment. It does not perform a translational alignment, although it does superimpose the centers of both proteins before rotation. The Monte Carlo method is more general, but slower, and can optimize more parameters simultaneously.
  3. It does not give the rotation values directly (e.g. {rx, ry, rz}), but combines all rotations in a composite rotation matrix which can be used to transform other coordinate points. Therefore, the sliders cannot be used with the conjugate axes method, and have no effect on the position of the second protein. Nor will the resulting angles be reported except via the rotation matrix printed to the console.

To perform an alignment, set the number of iterations to 25 and click 'Run'. The result is presented instantly:

Output is also produced on the system console:

Initial err = 17.4281
Iteration     1:        5.619949
Iteration     2:        5.504115
Iteration     3:        5.485816
Iteration     4:        5.485784
Iteration     5:        5.485784
Iteration     6:        5.485784
Iteration     7:        5.485784
Iteration     8:        5.485784
Iteration     9:        5.485784
Iteration    10:        5.485784
Iteration    11:        5.485784
Iteration    12:        5.485784
Iteration    13:        5.485784
Iteration    14:        5.485784
Iteration    15:        5.485784
Iteration    16:        5.485784
Iteration    17:        5.485784
Iteration    18:        5.485784
Iteration    19:        5.485784
Iteration    20:        5.485784
Iteration    21:        5.485784
Iteration    22:        5.485784
Iteration    23:        5.485784
Iteration    24:        5.485784
Iteration    25:        5.485784
Rotation matrix:
       0.584392        0.763068       -0.276066 
      -0.441732        0.013763       -0.897042 
      -0.680705        0.646171        0.345114 
Alignment took 0.000670999 seconds
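
The rotation matrix printed above can be applied to other coordinate points. The short Python sketch below shows one way to do that and to check that the matrix is a proper rotation. It assumes the matrix multiplies column vectors measured from the center of object 2; the console output does not state the convention, and the function names are made up for the example.

```python
# Values copied from the console output above.
ROTATION = [
    [0.584392, 0.763068, -0.276066],
    [-0.441732, 0.013763, -0.897042],
    [-0.680705, 0.646171, 0.345114],
]


def apply_rotation(matrix, point, center=(0.0, 0.0, 0.0)):
    """Rotate one (x, y, z) point about center using a 3x3 matrix."""
    rel = [point[k] - center[k] for k in range(3)]
    return tuple(
        sum(matrix[row][k] * rel[k] for k in range(3)) + center[row]
        for row in range(3))


def orthonormality_error(matrix):
    """Largest deviation of (matrix times its transpose) from the identity."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            dot = sum(matrix[i][k] * matrix[j][k] for k in range(3))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - target))
    return worst


# A made-up point 10 Angstroms from the origin; rotation keeps that distance.
moved = apply_rotation(ROTATION, (10.0, 0.0, 0.0))
print(moved)
print(orthonormality_error(ROTATION))
```

A result close to zero from orthonormality_error means the printed matrix preserves distances, apart from the rounding in the six printed decimals.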


© Sky Coyote 2007