DocumentCode :
2445111
Title :
Biosequence Classification using Sequential Pattern Mining and Optimization
Author :
Fotiadis, D.I. ; Exarchos, T.P. ; Tsipouras, M.G. ; Papaloukas, C.
Author_Institution :
Univ. of Ioannina, Ioannina
fYear :
2007
fDate :
8-11 Nov. 2007
Firstpage :
58
Lastpage :
61
Abstract :
In this paper, we present a methodology for biosequence classification that employs sequential pattern mining and optimization algorithms. In the first stage, a sequential pattern mining algorithm is applied to a set of biological sequences to extract sequential patterns. Then, the score of each pattern with respect to each sequence is calculated using a scoring function, and the score of each class under consideration is estimated. The pattern and class scores are then updated by multiplying each by a weight. In the second stage, an optimization technique is employed to calculate the weight values that achieve optimal classification accuracy. The methodology is applied to the protein class and fold prediction problem. Extensive evaluation is carried out using a dataset obtained from the Protein Data Bank.
Keywords :
biology computing; data mining; optimisation; pattern classification; biosequence classification; optimization algorithm; Protein Data Bank; scoring function; sequential pattern mining; Application software; Biology; Computer science; DNA; Information systems; Intelligent systems; Optimization methods; Proteins; Sequences; Text categorization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Technology Applications in Biomedicine, 2007. ITAB 2007. 6th International Special Topic Conference on
Conference_Location :
Tokyo
Print_ISBN :
978-1-4244-1868-8
Electronic_ISBN :
978-1-4244-1868-8
Type :
conf
DOI :
10.1109/ITAB.2007.4407423
Filename :
4407423