DocumentCode
2445111
Title
Biosequence Classification using Sequential Pattern Mining and Optimization
Author
Fotiadis, D.I. ; Exarchos, T.P. ; Tsipouras, M.G. ; Papaloukas, C.
Author_Institution
Univ. of Ioannina, Ioannina
fYear
2007
fDate
8-11 Nov. 2007
Firstpage
58
Lastpage
61
Abstract
In this paper we present a methodology for biosequence classification that employs sequential pattern mining and optimization algorithms. In the first stage, a sequential pattern mining algorithm is applied to a set of biological sequences to extract sequential patterns. The score of each pattern with respect to each sequence is then calculated using a scoring function, and the score of each class under consideration is estimated. The pattern and class scores are updated by multiplying each by a weight. In the second stage, an optimization technique is employed to calculate the weight values that achieve the optimal classification accuracy. The methodology is applied to the protein class and fold prediction problem. Extensive evaluation is carried out using a dataset obtained from the Protein Data Bank.
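The two-stage idea in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption, since the abstract does not specify the details: the miner is a stand-in that collects contiguous k-mers shared by enough sequences (not the paper's sequential pattern mining algorithm), the scoring function is a simple presence test, the optimizer is random hill climbing on training accuracy, and all function names and the toy sequences are hypothetical.

```python
from collections import Counter
import random


def mine_patterns(seqs, k=2, min_support=2):
    """Stand-in miner (assumption): contiguous k-mers found in >= min_support sequences."""
    support = Counter()
    for s in seqs:
        # Count each k-mer at most once per sequence.
        support.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return sorted(p for p, n in support.items() if n >= min_support)


def pattern_score(pattern, seq):
    """Assumed scoring function: 1.0 if the pattern occurs in the sequence, else 0.0."""
    return 1.0 if pattern in seq else 0.0


def class_scores(seq, patterns_by_class, weights):
    """Score of each class: weighted sum of the scores of that class's patterns."""
    return {
        c: sum(weights[(c, p)] * pattern_score(p, seq) for p in pats)
        for c, pats in patterns_by_class.items()
    }


def predict(seq, patterns_by_class, weights):
    scores = class_scores(seq, patterns_by_class, weights)
    # Iterate classes in sorted order so ties are broken deterministically.
    return max(sorted(scores), key=scores.get)


def accuracy(data, patterns_by_class, weights):
    hits = sum(predict(s, patterns_by_class, weights) == y for s, y in data)
    return hits / len(data)


def optimize_weights(data, patterns_by_class, iters=200, seed=0):
    """Second stage (assumed optimizer): perturb one weight at a time and
    keep the change only if training accuracy does not decrease."""
    rng = random.Random(seed)
    weights = {(c, p): 1.0 for c, pats in patterns_by_class.items() for p in pats}
    keys = sorted(weights)
    best = accuracy(data, patterns_by_class, weights)
    for _ in range(iters):
        key = rng.choice(keys)
        old = weights[key]
        weights[key] = max(0.0, old + rng.uniform(-0.5, 0.5))
        acc = accuracy(data, patterns_by_class, weights)
        if acc >= best:
            best = acc
        else:
            weights[key] = old  # revert a harmful change
    return weights, best


# Toy labelled sequences (hypothetical, not taken from the Protein Data Bank).
data = [("AAKAAK", "alpha"), ("AAKLAA", "alpha"),
        ("GVGVTV", "beta"), ("TVGVGG", "beta")]

# First stage: mine patterns separately for each class.
patterns_by_class = {
    label: mine_patterns([s for s, y in data if y == label])
    for label in sorted({y for _, y in data})
}

# Second stage: tune the weights against training accuracy.
weights, train_acc = optimize_weights(data, patterns_by_class)
```

Calling `predict("KAAKAA", patterns_by_class, weights)` then classifies an unseen sequence by the highest weighted class score. The sketch weights patterns only; the abstract also mentions class-level weights, which could be added as a second multiplier per class.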
Keywords
biology computing; data mining; optimisation; pattern classification; biosequence classification; optimization algorithm; protein data bank; scoring function; sequential pattern mining; Application software; Biology; Computer science; DNA; Information systems; Intelligent systems; Optimization methods; Proteins; Sequences; Text categorization; optimization
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology Applications in Biomedicine, 2007. ITAB 2007. 6th International Special Topic Conference on
Conference_Location
Tokyo
Print_ISBN
978-1-4244-1868-8
Electronic_ISBN
978-1-4244-1868-8
Type
conf
DOI
10.1109/ITAB.2007.4407423
Filename
4407423