• DocumentCode
    586720
  • Title

    An error probability estimation of the document classification using Markov model

  • Author

    Kobayashi, Masato ; Ninomiya, Hiroshi ; Matsushima, Takaaki ; Hirasawa, Shoichi

  • Author_Institution
    Shonan Inst. of Technol., Fujisawa, Japan
  • fYear
    2012
  • fDate
    28-31 Oct. 2012
  • Firstpage
    717
  • Lastpage
    721
  • Abstract
    The document classification problem has been studied with various techniques, such as the vector space model, support vector machines, and random forests. Separately, J. Ziv et al. have proposed a document classification method that uses the Ziv-Lempel algorithm to compress the data. Furthermore, the Context-Tree Weighting (CTW) algorithm has been proposed as an outstanding data compression method, and experimental results have been reported for document classification using the CTW algorithm. In this paper, we assume that documents in the same category are generated by a Markov model with the same parameters. We then propose an analysis method to estimate the classification error probability for documents of finite length.
  • Keywords
    Markov processes; data compression; error detection; Markov model; Ziv-Lempel algorithm; classification error probability; context-tree weighting algorithm; document classification; error probability estimation; finite length; outstanding data compression; random forest; support vector machine; vector space model; Approximation methods; Context; Context modeling; Error probability; Estimation; Information theory; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Theory and its Applications (ISITA), 2012 International Symposium on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    978-1-4673-2521-9
  • Type

    conf

  • Filename
    6401034