DocumentCode :
1649641
Title :
Combination of Multiple Distance Measures for Protein Fold Classification
Author :
Suryanto, Chendra Hadi ; Hino, Hideitsu ; Fukui, Kazuhiro
Author_Institution :
Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tsukuba, Japan
fYear :
2013
Firstpage :
440
Lastpage :
445
Abstract :
In structural biology, measuring the similarity between two protein structures is an essential task. The most common approach is to find the best alignment between two protein backbone structures and use the root mean square deviation (RMSD) of the superimposed alpha-carbon atom coordinates as the distance measure. Other approaches extract features from the protein structures and base the similarity measure on those features. However, there is no single best approach, as each has its own advantages and limitations. One intuitive idea is that a better result can be obtained by combining complementary approaches. In this paper, we propose a new approach to protein fold classification that introduces the concept of large margin nearest neighbor for combining multiple measures of distance between protein structures. We combine the Euclidean distance matrices of 12 features extracted from the amino acid sequence of the protein, the RMSD obtained from the geometrical alignment using Combinatorial Extension, and the canonical angles between the subspaces generated from the synthesized multi-view protein structure images. We demonstrate the effectiveness of the proposed method by classifying 27 fold classes of proteins in the Ding Dubchak dataset.
Keywords :
biology computing; distance measurement; feature extraction; image classification; matrix algebra; mean square error methods; proteins; Ding Dubchak dataset; Euclidean distance matrices; RMSD; amino acid sequence; canonical angles; combinatorial extension; distance measurement; feature extraction; geometrical alignment; large margin nearest neighbor; multiview protein structure images; protein backbone structures; protein fold classification; protein structures similarity; root mean square deviation; similarity measure; structural biology; superimposed alpha-carbon atom coordinates; Accuracy; Feature extraction; Measurement; Proteins; Three-dimensional displays; Training; Vectors; large margin nearest neighbor; metric learning; protein fold classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ACPR), 2013 2nd IAPR Asian Conference on
Conference_Location :
Naha
Type :
conf
DOI :
10.1109/ACPR.2013.139
Filename :
6778357