Title :
A new clustering-based method for protein structure selection
Author :
Wang, Qingguo ; Shang, Yi ; Xu, Dong
Author_Institution :
Dept. of Comput. Sci., Univ. of Missouri, Columbia, MO
Abstract :
In protein tertiary structure prediction, selecting near-native structures from a large number of candidate structural models is a crucial step. Despite much effort on the problem of protein structure selection, the discriminative power of current scoring functions is still unsatisfactory. In this paper, we develop a new clustering-based method for selecting near-native protein structures. Our method consists of three phases: filtering, clustering and cluster reduction, and centroid construction. Given a set of Cα protein structures, we apply one or more existing scoring functions to filter out bad structures. We then group the remaining structures into clusters based on pairwise similarity measured by RMSD. Each cluster is reduced iteratively to remove outliers and bad structures. Finally, we construct a centroid for each cluster by applying multidimensional scaling techniques. The centroids are the final models. In experiments, we applied our method to a test set of representative proteins and obtained significant improvement over existing methods.
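The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the scoring functions, the clustering algorithm, the outlier criterion, or how multidimensional scaling is used, so every one of those choices below is an assumption. Specifically, the sketch assumes that lower scores are better, uses a greedy RMSD-threshold clustering, trims the member with the largest mean RMSD, and builds each centroid by averaging the members' internal Cα distance matrices and embedding the result in 3D with classical multidimensional scaling. All function names and parameters (`filter_models`, `greedy_clusters`, `reduce_cluster`, `mds_centroid`, `select_models`, `cutoff`, `keep_fraction`) are hypothetical.

```python
import numpy as np

def kabsch_rmsd(a, b):
    """RMSD between two (N, 3) C-alpha arrays after optimal superposition."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    u, s, vt = np.linalg.svd(a.T @ b)
    # If the best orthogonal map is a reflection, flip the smallest singular value.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))

def filter_models(models, scores, keep_fraction=0.5):
    """Phase 1 (assumed form): keep the best-scoring fraction, lower score = better."""
    order = np.argsort(scores)
    n_keep = max(1, int(round(keep_fraction * len(models))))
    return [models[i] for i in order[:n_keep]]

def rmsd_matrix(models):
    """Symmetric matrix of pairwise RMSD values."""
    n = len(models)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = kabsch_rmsd(models[i], models[j])
    return d

def greedy_clusters(d, cutoff):
    """Phase 2a (assumed algorithm): repeatedly extract the structure with the
    most neighbors within `cutoff`, together with those neighbors."""
    remaining = set(range(len(d)))
    clusters = []
    while remaining:
        idx = sorted(remaining)
        near = d[np.ix_(idx, idx)] <= cutoff
        center = int(np.argmax(near.sum(axis=1)))
        members = [idx[k] for k in np.flatnonzero(near[center])]
        clusters.append(members)
        remaining -= set(members)
    return clusters

def reduce_cluster(d, members, min_size=3, max_mean_rmsd=1.0):
    """Phase 2b (assumed criterion): drop the member with the largest mean RMSD
    to the others until every member is within `max_mean_rmsd` on average."""
    members = list(members)
    while len(members) > min_size:
        sub = d[np.ix_(members, members)]
        mean_d = sub.sum(axis=1) / (len(members) - 1)
        worst = int(np.argmax(mean_d))
        if mean_d[worst] <= max_mean_rmsd:
            break
        members.pop(worst)
    return members

def mds_centroid(structures):
    """Phase 3 (assumed construction): average the members' internal distance
    matrices, then recover 3D coordinates with classical MDS."""
    dist = np.mean(
        [np.linalg.norm(s[:, None] - s[None, :], axis=-1) for s in structures], axis=0
    )
    n = len(dist)
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    w, v = np.linalg.eigh(gram)
    top = np.argsort(w)[::-1][:3]
    coords = v[:, top] * np.sqrt(np.clip(w[top], 0.0, None))
    # Distances cannot fix handedness, so keep the mirror image that matches a member.
    mirrored = coords * np.array([1.0, 1.0, -1.0])
    ref = structures[0]
    return coords if kabsch_rmsd(coords, ref) <= kabsch_rmsd(mirrored, ref) else mirrored

def select_models(models, scores, cutoff=2.0, keep_fraction=0.5):
    """Run filtering, clustering with reduction, and centroid construction."""
    kept = filter_models(models, scores, keep_fraction)
    d = rmsd_matrix(kept)
    centroids = []
    for members in greedy_clusters(d, cutoff):
        members = reduce_cluster(d, members, max_mean_rmsd=cutoff / 2)
        centroids.append(mds_centroid([kept[i] for i in members]))
    return centroids
```

Here `models` is a list of equally sized (N, 3) NumPy arrays of Cα coordinates and `scores` is a matching sequence of numbers; `select_models(models, scores)` returns one centroid array per cluster. The mirror-image check in `mds_centroid` is needed because a distance matrix alone does not determine chirality.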
Keywords :
biology computing; pattern clustering; proteins; Cα protein structures; centroid construction phase; cluster reduction phase; clustering-based method; clustering phase; filtering phase; multidimensional scaling techniques; near-native structures; pairwise similarity; protein structure selection; protein tertiary structure prediction; scoring function; structural model; Bioinformatics; Crystallography; Drugs; Filtering; Filters; Genomics; Predictive models; Protein engineering; Sequences; Testing
Conference_Title :
IEEE International Joint Conference on Neural Networks, 2008 (IJCNN 2008), IEEE World Congress on Computational Intelligence
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1820-6
Electronic_ISBN :
1098-7576
DOI :
10.1109/IJCNN.2008.4634205