Title of article
Geometry preserving projections algorithm for predicting membrane protein types
Author/Authors
Wang, Tong; Xia, Tian; Hu, Xiao-ming
Issue Information
Journal with serial number, year 2010
Pages
6
From page
208
To page
213
Abstract
Given a new, uncharacterized protein sequence, a biologist may want to know whether it is a membrane protein and, if so, which membrane protein type it belongs to. Because knowing the type of an uncharacterized membrane protein often provides useful clues to the biological function of the query protein, computational methods that address these questions can be very helpful. In this study, a sequence encoding scheme that combines the pseudo position-specific score matrix (PsePSSM) and dipeptide composition (DC) is introduced to represent protein samples. This encoding scheme, however, yields a very high-dimensional feature vector. A dimensionality reduction algorithm, geometry preserving projections (GPP), is therefore introduced to extract the key features from the high-dimensional space and reduce the original vector to a lower-dimensional one. Finally, K-nearest neighbor (K-NN) and support vector machine (SVM) classifiers are employed to identify the types of membrane proteins from their reduced, low-dimensional features. The results obtained in jackknife and independent dataset tests are quite encouraging, indicating that these methods can deal effectively with the complicated problem of predicting membrane protein type.
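As a rough illustration of two steps named in the abstract, the sketch below computes a dipeptide composition (DC) vector and scores a K-NN classifier with a jackknife (leave-one-out) test. It is not the authors' implementation: the PsePSSM features and the GPP reduction step are omitted because the abstract does not specify them, the Euclidean distance is an assumption, and all function names and the toy sequences are hypothetical.

```python
from itertools import product
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(seq):
    """Return the 400-dimensional frequency vector of adjacent residue pairs."""
    counts = [0.0] * len(DIPEPTIDES)
    total = 0
    for a, b in zip(seq, seq[1:]):
        i = INDEX.get(a + b)
        if i is not None:  # skip pairs that contain a non-standard residue
            counts[i] += 1
            total += 1
    return [c / total for c in counts] if total else counts


def euclidean(u, v):
    """Euclidean distance (an assumed metric; the abstract names none)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k training vectors closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda t: euclidean(t[0], query))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def jackknife_accuracy(features, labels, k=1):
    """Leave each sample out in turn, predict it from the rest, and score."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, features[i], k) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical toy data; real inputs would be labelled membrane protein sequences.
toy_sequences = ["AAAA", "AAAAA", "CCCC", "CCCCC"]
toy_labels = ["type_x", "type_x", "type_y", "type_y"]
toy_features = [dipeptide_composition(s) for s in toy_sequences]
toy_accuracy = jackknife_accuracy(toy_features, toy_labels, k=1)
```

In a fuller pipeline, the dimensionality reduction step would sit between `dipeptide_composition` and `jackknife_accuracy`, so that the classifier sees the reduced vectors rather than the raw 400-dimensional ones.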
Keywords
Dimensionality reduction, K-nearest neighbor (K-NN), Bioinformatics
Journal title
Journal of Theoretical Biology
Serial Year
2010
Record number
1539966