• DocumentCode
    423975
  • Title
    A mutual information kernel for sequences
  • Author
    Cuturi, Marco; Vert, Jean-Philippe
  • Author_Institution
    Comput. Biol. Group, Ecole des Mines de Paris, Fontainebleau, France
  • Volume
    3
  • fYear
    2004
  • fDate
    25-29 July 2004
  • Firstpage
    1905
  • Abstract
    We propose a new kernel for strings that borrows ideas and techniques from information theory and data compression. This kernel can be used with any kernel method, in particular with support vector machines for protein classification. By incorporating prior assumptions on the properties of the alphabet and using a Bayesian averaging framework, we compute the value of this kernel in linear time and space, building on earlier results from the field of universal coding. Encouraging classification results are reported on a standard protein homology detection experiment.
  • Keywords
    Bayes methods; biocomputing; pattern classification; proteins; sequences; support vector machines; Bayesian averaging framework; data compression; information theory; kernel method; mutual information kernel; protein classification; protein homology detection; protein sequences; Biological system modeling; Biology computing; Computational biology; Hidden Markov models; Kernel; Mutual information; Support vector machine classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-8359-1
  • Type
    conf
  • DOI
    10.1109/IJCNN.2004.1380902
  • Filename
    1380902