• DocumentCode
    2959213
  • Title
    Classifiers based on Bernoulli mixture models for text mining and handwriting recognition tasks
  • Author
    Saeed, Mehreen; Babri, Haroon
  • Author_Institution
    Nat. Univ. of Comput. & Emerging Sci., Lahore
  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    2169
  • Lastpage
    2175
  • Abstract
    In this paper we describe a model for classifying binary data using classifiers based on Bernoulli mixture models. We show how Bernoulli mixtures can be used for feature extraction and dimensionality reduction of raw input data. The extracted features are then used for training a classifier for supervised labeling of individual sample points. We have applied this method to two different types of datasets, i.e., one from the text mining domain and one from the handwriting recognition area. Empirical experiments demonstrate that we can obtain up to 99.9% reduction in the dimensionality of the original feature set for sparse binary features. Classification accuracy also increases considerably when the combined model is used. This paper compares the performance of different classification algorithms when used in conjunction with the new feature set generated by Bernoulli mixtures. Using this hybrid model of learning we have achieved one of the best accuracy rates on the NOVA and GINA datasets of the 'agnostic vs. prior knowledge' competition held by the International Joint Conference on Neural Networks in 2007.
  • Keywords
    data mining; feature extraction; handwriting recognition; learning (artificial intelligence); neural nets; text analysis; Bernoulli mixture models; binary data classification; dimensionality reduction; feature extraction; handwriting recognition tasks; neural networks; supervised labeling; text mining; Classification algorithms; Data mining; Feature extraction; Handwriting recognition; Labeling; Neural networks; Support vector machine classification; Support vector machines; Text categorization; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2008.4634097
  • Filename
    4634097