DocumentCode :
472214
Title :
Protein Classification using Sequential Pattern Mining
Author :
Exarchos, Themis P. ; Papaloukas, Costas ; Lampros, Christos ; Fotiadis, Dimitrios I.
Author_Institution :
Dept. of Comput. Sci., Ioannina Univ.
fYear :
2006
fDate :
Aug. 30 2006-Sept. 3 2006
Firstpage :
5814
Lastpage :
5817
Abstract :
Protein classification in terms of fold recognition can be employed to determine the structural and functional properties of a newly discovered protein. In this work, sequential pattern mining (SPM) is utilized for sequence-based fold recognition. One of the most efficient SPM algorithms, cSPADE, is employed for protein primary structure analysis. A classifier then uses the extracted sequential patterns to assign proteins of unknown structure to the appropriate fold category. The proposed methodology exhibited an overall accuracy of 36% in a multi-class problem of 17 candidate categories. The classification performance reaches up to 65% when the three most probable protein folds are considered.
Keywords :
biology computing; data mining; molecular biophysics; pattern classification; pattern recognition; proteins; cSPADE; protein classification; protein primary structure analysis; sequence-based fold recognition; sequential pattern mining; Algorithm design and analysis; Cities and towns; Data mining; Genomics; Itemsets; Pattern recognition; Proteins; Scanning probe microscopy; Sequences; USA Councils;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Engineering in Medicine and Biology Society, 2006. EMBS '06. 28th Annual International Conference of the IEEE
Conference_Location :
New York, NY
ISSN :
1557-170X
Print_ISBN :
1-4244-0032-5
Electronic_ISBN :
1557-170X
Type :
conf
DOI :
10.1109/IEMBS.2006.260336
Filename :
4463129