When N = 3 pairs, you can get the 3d fit down to a very small RMSD value:

Even with 5 pairs, the best RMSD is below 1 Angstrom:
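
For reference, the RMSD of a 3d fit can be computed by superposing the paired atoms with an optimal rigid rotation. The sketch below uses the Kabsch method with NumPy. The function name `superposed_rmsd` and the N x 3 array layout are assumptions for illustration, not necessarily the code used to produce the results here.

```python
import numpy as np

def superposed_rmsd(P, Q):
    """RMSD between paired points P and Q (each N x 3) after the best
    rigid-body superposition (translation plus proper rotation)."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translation by centering both point sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch: SVD of the 3 x 3 covariance matrix gives the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a rotation, not a mirror.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

With very few pairs the rotation has almost as many degrees of freedom as there are constraints, which is one reason a fit on 3 pairs can reach a very small RMSD.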

But, with 10 pairs, the 3d fit starts to become poorer:

With 25 pairs, worse still. Nevertheless, notice that there is structure in the dot plot, and the 'best' synthetic 1d alignment tends to show clustering in nearby atoms. Some of these clusters even fall near the Dali alignment:

50 pairs:

100 pairs:

In this case, even the synthetic 1d alignment is better than Needleman-Wunsch for a 3d fit, although it uses slightly fewer pairs. This work is promising, but there appear to be two problems:

- Entropy. Based on the NW and Dali alignments, one might expect a good 1d/3d alignment to consist of short line segments of nearby atoms, with gaps in between. These segments could run from bottom left to top right for subsequences going in the same direction, or from top left to bottom right for antiparallel subsequences. Purely random alignments are unlikely to converge on this kind of coherent structure. What I might try is to start with a linear sequential alignment and perturb it away from linearity, rather than the other way around, to see if it converges.

- Freezing. Since there can only be one dot per row and column, alignments with many pairs tend to get stuck in a particular configuration with nowhere to go. I need to find a way to 'unfreeze' these configurations away from local minima so that the error can continue to drop.
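
These notes do not record how either problem was eventually handled. As one possible way to explore the two ideas, the sketch below starts from a linear sequential alignment and perturbs it with moves that always keep one dot per row and column. The swap move and the Metropolis-style acceptance rule are assumptions swapped in for illustration; all names (`linear_alignment`, `propose`, `anneal`) are hypothetical, and `cost` stands for any function that scores an alignment, such as the RMSD of its 3d fit.

```python
import math
import random

def linear_alignment(n_pairs):
    """Sequential start: atom k of chain A is paired with atom k of chain B."""
    return [(k, k) for k in range(n_pairs)]

def propose(pairs, len_b, rng):
    """Return a perturbed copy that still has one dot per row and column."""
    new = list(pairs)
    a = rng.randrange(len(new))
    used = {j for _, j in new}
    free = [j for j in range(len_b) if j not in used]
    if free and rng.random() < 0.5:
        # Shift move: send one dot to the closest free column.
        i, j = new[a]
        new[a] = (i, min(free, key=lambda c: abs(c - j)))
    else:
        # Swap move: exchange the columns of two dots. This is always legal,
        # even when every column is occupied and no shift is possible.
        b = rng.randrange(len(new))
        (ia, ja), (ib, jb) = new[a], new[b]
        new[a], new[b] = (ia, jb), (ib, ja)
    return new

def anneal(pairs, len_b, cost, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Minimize cost(pairs), sometimes accepting worse moves to escape
    frozen configurations. Returns the best alignment seen and its cost."""
    rng = random.Random(seed)
    current, current_cost = list(pairs), cost(pairs)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        cand = propose(current, len_b, rng)
        cand_cost = cost(cand)
        delta = cand_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost
```

A call might look like `anneal(linear_alignment(100), len_b, cost)`, where `len_b` is the number of atoms in the second chain. The temperature schedule and the 50/50 mix of shift and swap moves are arbitrary starting points that would need tuning.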

I've solved these problems, and now when aligning on 100 atom pairs, the best I have done is an RMSD of 2.88 Angstroms. The 1d alignment shows significant clustering:

Another 3d alignment at about the same RMSD value shows a similar 1d pattern:

Other 3d alignments with slightly higher RMSD values show 1d patterns that differ slightly from these but resemble one another:

However, a 3d alignment with a lower (better) RMSD value shows a very different 1d pattern:

For 77 atom pairs, the best I have done is 2.54 Angstroms. The Dali result in the H&S paper was 4.2 Angstroms, although the 1d alignment was somewhat different:

For 42 atom pairs, the best I have done is 1.86 Angstroms. The Dali result in the H&S paper was 2.2 Angstroms:

Although I am now getting good alignments, the biological significance of the results is questionable:

- Why do 3d alignments with similar RMSD values result from very different 1d alignments? Do these represent equally likely conformational geometries of actual molecules (e.g. enzymes, prions)?

- Although there is clustering in the 1d alignments, the clusters are not generally linear, contiguous subsequences of atoms, whereas real proteins appear to have linear homologous segments. Perhaps I should try constraining the random 1d alignments to allow only contiguous linear subsequences of length N ≥ 4, as suggested by the H&S paper.
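
To make the contiguity idea concrete, the sketch below splits a 1d alignment into maximal diagonal runs and keeps only those with at least 4 pairs. Runs may be parallel (both indices increase together) or antiparallel (one increases while the other decreases). The function name `diagonal_segments` and the `(i, j)` pair representation are assumptions for illustration; this is not taken from the H&S paper.

```python
def diagonal_segments(pairs, min_len=4):
    """Split an alignment into maximal runs in which consecutive dots step
    by (+1, +1) (parallel) or (+1, -1) (antiparallel), and return the runs
    that contain at least min_len pairs."""
    segments, run, direction = [], [], 0
    for pair in sorted(pairs):
        if run:
            di = pair[0] - run[-1][0]
            dj = pair[1] - run[-1][1]
            # step is +1 or -1 for a diagonal neighbour, 0 otherwise.
            step = dj if (di == 1 and abs(dj) == 1) else 0
            if step != 0 and (direction == 0 or step == direction):
                run.append(pair)
                direction = step
                continue
            # The run is broken: keep it only if it is long enough.
            if len(run) >= min_len:
                segments.append(run)
        # Start a new run at the current dot.
        run, direction = [pair], 0
    if len(run) >= min_len:
        segments.append(run)
    return segments
```

A candidate alignment could then be rejected, or penalized, when too few of its pairs fall inside the returned segments. Note that a dot at the corner between a parallel and an antiparallel run is assigned to the earlier run only.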

© Sky Coyote 2007