DocumentCode
472214
Title
Protein Classification using Sequential Pattern Mining
Author
Exarchos, Themis P. ; Papaloukas, Costas ; Lampros, Christos ; Fotiadis, Dimitrios I.
Author_Institution
Dept. of Comput. Sci., Ioannina Univ.
fYear
2006
fDate
Aug. 30 2006-Sept. 3 2006
Firstpage
5814
Lastpage
5817
Abstract
Protein classification in terms of fold recognition can be employed to determine the structural and functional properties of a newly discovered protein. In this work, sequential pattern mining (SPM) is utilized for sequence-based fold recognition. One of the most efficient SPM algorithms, cSPADE, is employed for protein primary structure analysis. A classifier then uses the extracted sequential patterns to assign proteins of unknown structure to the appropriate fold category. The proposed methodology exhibited an overall accuracy of 36% in a multi-class problem with 17 candidate categories. The classification performance reaches up to 65% when the three most probable protein folds are considered.
Keywords
biology computing; data mining; molecular biophysics; pattern classification; pattern recognition; proteins; cSPADE; protein classification; protein primary structure analysis; sequence-based fold recognition; sequential pattern mining; Algorithm design and analysis; Cities and towns; Data mining; Genomics; Itemsets; Pattern recognition; Proteins; Scanning probe microscopy; Sequences; USA Councils
fLanguage
English
Publisher
IEEE
Conference_Titel
Engineering in Medicine and Biology Society, 2006. EMBS '06. 28th Annual International Conference of the IEEE
Conference_Location
New York, NY
ISSN
1557-170X
Print_ISBN
1-4244-0032-5
Electronic_ISBN
1557-170X
Type
conf
DOI
10.1109/IEMBS.2006.260336
Filename
4463129
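Illustrative note on the abstract: the abstract describes scoring a protein against sequential patterns mined per fold, and reports accuracy both for the single best fold and for the three most probable folds. The following is only a minimal sketch of that idea, not the authors' method. The gapped-subsequence matching, the scoring rule (fraction of a fold's patterns found in the sequence), and all function names, pattern sets and example sequences are assumptions made for illustration; the record does not specify how cSPADE was configured or which classifier was used.

```python
from typing import Dict, List


def contains_pattern(sequence: str, pattern: str) -> bool:
    """Return True if `pattern` occurs in `sequence` as an ordered,
    possibly gapped, subsequence (an assumed matching rule)."""
    residues = iter(sequence)
    # `in` on an iterator consumes it, so order is enforced.
    return all(symbol in residues for symbol in pattern)


def fold_scores(sequence: str,
                patterns_by_fold: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each fold by the fraction of its patterns found in the
    sequence (a hypothetical scoring rule, not taken from the paper)."""
    return {
        fold: sum(contains_pattern(sequence, p) for p in patterns) / len(patterns)
        for fold, patterns in patterns_by_fold.items()
    }


def top_k_folds(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k folds with the highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def top_k_accuracy(true_folds: List[str],
                   score_list: List[Dict[str, float]],
                   k: int) -> float:
    """Fraction of proteins whose true fold is among the k best-scoring folds."""
    hits = sum(t in top_k_folds(s, k) for t, s in zip(true_folds, score_list))
    return hits / len(true_folds)


# Toy, made-up pattern sets for three hypothetical folds.
patterns = {"a": ["KLA", "MV"], "b": ["LK", "AA"], "c": ["MK", "ZZ"]}
scores = fold_scores("MKVLA", patterns)
```

With k = 1 this corresponds to the overall accuracy figure quoted in the abstract, and with k = 3 to the "three most probable folds" figure; the toy data here is not intended to reproduce either number.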