Publication details

Molecular Structure Computation from Multiple Data Sources
Technical Report
Reference:
C. C. Chen. Published in 1999.
Abstract:

Elucidating the three-dimensional structure of biological molecules such as nucleic acids, proteins, and their macromolecular assemblies is fundamental to understanding their functions. It also poses a computational challenge because of the large number of parameters and the non-linear relationships between them. This dissertation builds on a probabilistic least squares approach adapted for molecular structure estimation, using multiple sources of uncertain data. We demonstrate contributions to both the bioinformatics domain and the high-performance computing aspects of the structure estimation problem.

On the domain side, we have developed new algorithms that integrate more complicated structural data than the usual inter-atomic distances and angles, which are already well served by sophisticated mathematical theories and powerful computational algorithms. These new structural constraints include the moments of the internal distance distribution of the molecule of interest, solvent accessibility and hydrophobic packing, and imperfect secondary structure predictions. In each case, we show how the quality of the computed structures improves when the new type of constraint is added to a series of sparse short-range distances that constitute the baseline data. Finally, we combine the techniques we have developed in a sample structure estimation problem. The results allow us to gauge the relative information content of different constraint types and in turn may be useful in guiding the collection of experimental data.

On the computing side, we have significantly sped up the process using a two-pronged strategy: hierarchical decomposition of the underlying algorithm and parallel processing. The hierarchical decomposition of the original algorithm results in an irregular, two-tiered pattern of parallelism in the computation. We have developed a new dynamic load balancing approach to exploit this type of parallelism. Our load balancing approach is generalizable to other similarly structured parallel applications, several of which we briefly describe.
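
The abstract does not describe the load balancing algorithm itself, so the following is only a generic illustration of why dynamic assignment helps when task costs are irregular. It is a minimal Python sketch of greedy list scheduling, not the dissertation's method; the function names, the task costs, and the worker count are all hypothetical.

```python
import heapq


def static_makespan(costs, n_workers):
    """Assign task i to worker i % n_workers up front; return the finish time."""
    loads = [0] * n_workers
    for i, cost in enumerate(costs):
        loads[i % n_workers] += cost
    return max(loads)


def dynamic_makespan(costs, n_workers):
    """Hand each task, in order, to whichever worker becomes idle first."""
    free_at = [0] * n_workers  # min-heap of times at which each worker is idle
    heapq.heapify(free_at)
    for cost in costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    return max(free_at)


# Hypothetical irregular workload: a few expensive tasks among many cheap ones.
costs = [8, 1, 1, 1, 8, 1, 1, 1, 8, 1, 1, 1]

print("static :", static_makespan(costs, 4))
print("dynamic:", dynamic_makespan(costs, 4))
```

With this made-up workload, the static round-robin assignment sends every expensive task to the same worker, while the dynamic assignment spreads them out and finishes close to the ideal of total work divided by worker count.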

Information last updated: Sat Jun 2 2007
Stanford School of Medicine