• DocumentCode
    2326961
  • Title
    Improving the k-NN and applying it to Chinese text classification
  • Author
    Yuan, Fang; Yang, Liu; Yu, Ge
  • Author_Institution
    Coll. of Math. & Comput. Sci., Hebei Univ., China
  • Volume
    3
  • fYear
    2005
  • fDate
    18-21 Aug. 2005
  • Firstpage
    1547
  • Abstract
    To address the problems of applying k-NN to Chinese text classification, this paper proposes several improvements to k-NN. Word segmentation based on dictionaries and statistics can increase classification accuracy and reduce the number of dimensions. Applying a genetic algorithm to learn the value of k can make classification more automatic. The gradual classification mode improves classification efficiency. The experiments show that these improvements to k-NN increase the efficiency of Chinese text classification while maintaining high accuracy.
  • Keywords
    classification; genetic algorithms; text analysis; Chinese text classification; classification automatization; genetic algorithm; k-nearest neighbor; word segmentation; Computer science; Educational institutions; Electronic mail; Information science; Internet; Mathematics; Statistics; Testing; Text categorization; gradual classification mode; k-Nearest Neighbor method; text preprocessing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
  • Conference_Location
    Guangzhou, China
  • Print_ISBN
    0-7803-9091-1
  • Type
    conf
  • DOI
    10.1109/ICMLC.2005.1527190
  • Filename
    1527190