  • DocumentCode
    2024321
  • Title
    The design and implementation of an excellent text categorization system
  • Author
    Lu, Mingyu; Diao, LiLi; Lu, Yuchang; Zhou, Lizhu
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • Volume
    1
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    459
  • Abstract
    Based on a study of text classification techniques, we propose a new text categorization method that uses a weight adjustment measure to improve the vector space model and the naive Bayesian classifier. An experimental text classification system, CWZ, is implemented to compare various text classification approaches. CWZ performs much better than many commercial text classification systems. We introduce its framework, functions, main modules, and running environment, present our experimental results, and discuss several important technical issues in the system, from which we draw some conclusions. We also describe how the vector space model and the naive Bayesian classifier are improved.
  • Keywords
    Bayes methods; classification; expert systems; learning (artificial intelligence); text analysis; CWZ; naive Bayesian classifier; technical issues; text categorization system; text classification techniques; vector space model; weight adjustment measure; Automation; Bayesian methods; Computer science; Electronic mail; Extraterrestrial measurements; Intelligent control; Space technology; Text categorization; Text mining; Weight measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Control and Automation, 2002. Proceedings of the 4th World Congress on
  • Print_ISBN
    0-7803-7268-9
  • Type
    conf
  • DOI
    10.1109/WCICA.2002.1022151
  • Filename
    1022151