DocumentCode :
423975
Title :
A mutual information kernel for sequences
Author :
Cuturi, Marco ; Vert, Jean-Philippe
Author_Institution :
Comput. Biol. Group, Ecole des Mines de Paris, Fontainebleau, France
Volume :
3
fYear :
2004
fDate :
25-29 July 2004
Firstpage :
1905
Abstract :
We propose a new kernel for strings which borrows ideas and techniques from information theory and data compression. This kernel can be used in combination with any kernel method, in particular support vector machines for protein classification. By incorporating prior assumptions on the properties of the alphabet and using a Bayesian averaging framework, we compute the value of this kernel in linear time and space, benefiting from previous achievements proposed in the field of universal coding. Encouraging classification results are reported on a standard protein homology detection experiment.
Keywords :
Bayes methods; biocomputing; pattern classification; proteins; sequences; support vector machines; Bayesian averaging framework; data compression; information theory; kernel method; mutual information kernel; protein classification; protein homology detection; protein sequences; Biological system modeling; Biology computing; Computational biology; Hidden Markov models; Kernel; Mutual information; Support vector machine classification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
ISSN :
1098-7576
Print_ISBN :
0-7803-8359-1
Type :
conf
DOI :
10.1109/IJCNN.2004.1380902
Filename :
1380902