DocumentCode :
2870212
Title :
A novel ensemble approach to prediction of protein subcellular location
Author :
Yue-Hui, Chen ; Li-Yuan, Liu ; Bing-Xian, Ma
Author_Institution :
Sch. of Inf. Sci. & Eng., Univ. of Jinan, Jinan, China
Volume :
9
fYear :
2010
fDate :
22-24 Oct. 2010
Abstract :
The prediction of protein subcellular location has attracted considerable attention in both research and practical application, since a large body of previous work has established the close relationship between a protein's function and its location, and since the Human Genome Project was successfully completed in recent decades. With the rapid growth of computing speed, computational intelligence methods have come to dominate the prediction of protein subcellular location. In this study, we use the pseudo amino acid (PseAA) model to extract features from the primary protein sequence as input to the classifier. Based on the evolutionary fuzzy k-nearest neighbor (EFKNN) algorithm, we train six base classifiers, each with a different k-value, a parameter that plays an important role in both training and classification. We then propose a novel ensemble approach, named accumulative vote quantity (AVQ), to integrate the outputs of the six base classifiers. To verify the effectiveness of the proposed method, we adopt as the training set the benchmark dataset constructed by Jennifer L. Gardy and Fiona S.L. Brinkman in 2006, which covers five subcellular locations of Gram-negative bacteria. The jackknife test on this dataset yields a prediction accuracy of 80.0%, which suggests that the proposed method can serve as a useful prediction tool or, at least, complement existing prediction methods.
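The pipeline in the abstract (several fuzzy k-nearest neighbor classifiers with different k-values, combined by voting and scored with a jackknife test) can be illustrated with a minimal Python sketch. Several parts are assumptions, because the abstract does not define them: the AVQ rule is replaced here by plain plurality voting, the evolutionary tuning step is omitted (the fuzzifier `m` is fixed), and feature vectors are taken as already extracted. All function names and `TOY_DATA` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def fuzzy_knn_memberships(train, query, k, m=2.0):
    """Class memberships of `query` from its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs with crisp labels.
    Neighbours are weighted by inverse distance raised to 2 / (m - 1).
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    weights = defaultdict(float)
    for features, label in neighbours:
        distance = max(math.dist(features, query), 1e-12)  # avoid division by zero
        weights[label] += 1.0 / distance ** (2.0 / (m - 1.0))
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}

def ensemble_predict(train, query, k_values):
    """Plurality vote over one fuzzy k-NN classifier per k (assumed stand-in for AVQ)."""
    votes = Counter()
    for k in k_values:
        memberships = fuzzy_knn_memberships(train, query, k)
        votes[max(memberships, key=memberships.get)] += 1
    return votes.most_common(1)[0][0]

def jackknife_accuracy(data, k_values):
    """Leave-one-out test: each sample is predicted from all the other samples."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if ensemble_predict(rest, features, k_values) == label:
            correct += 1
    return correct / len(data)

# Hypothetical toy data: two well-separated clusters standing in for PseAA vectors.
TOY_DATA = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.0, 0.1), "A"), ((0.1, 0.1), "A"),
    ((5.0, 5.0), "B"), ((5.1, 5.0), "B"), ((5.0, 5.1), "B"), ((5.1, 5.1), "B"),
]

if __name__ == "__main__":
    print(jackknife_accuracy(TOY_DATA, k_values=(1, 2, 3)))
```

In the jackknife loop every sample is held out exactly once, so the reported figure is the fraction of held-out samples that the ensemble labels correctly.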
Keywords :
benchmark testing; bioinformatics; computational complexity; evolutionary computation; feature extraction; learning (artificial intelligence); pattern classification; proteins; Fiona S.L. Brinkman; Jennifer L. Gardy; accumulative vote quantity; benchmark dataset; computational intelligence method; ensemble approach; evolutionary fuzzy k-nearest neighbor algorithm; feature extraction; gram-negative bacterial; human genome project; jackknife test result; prediction tool; protein function; protein primitive sequence; protein subcellular localization prediction; pseudo amino acid model; Accuracy; Amino acids; Biomembranes; Classification algorithms; Protein sequence; Training; accumulative vote quantity; ensemble learning; evolutionary fuzzy KNN; jackknife test; protein subcellular location; pseudo amino acid composition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Application and System Modeling (ICCASM), 2010 International Conference on
Conference_Location :
Taiyuan
Print_ISBN :
978-1-4244-7235-2
Electronic_ISBN :
978-1-4244-7237-6
Type :
conf
DOI :
10.1109/ICCASM.2010.5622971
Filename :
5622971
Link To Document :