• DocumentCode
    2276484
  • Title

    Text Categorization Research Based on Cluster Idea

  • Author

    Lin, Jialun ; Li, Xiaoling ; Jiao, Yuan

  • Author_Institution
    Comput. Teaching & Res. Sect., Hainan Med. Coll., Haikou, China
  • Volume
    1
  • fYear
    2010
  • fDate
    6-7 March 2010
  • Firstpage
    483
  • Lastpage
    486
  • Abstract
    Classification and clustering are frequently used methods in data mining. This paper introduces the idea of text clustering into the study of categorization algorithms. The authors also use the self-initiated learning pattern of text categorization to design a clustering-based text categorization algorithm, with the aim of reducing the dimension of the training set and raising the efficiency of categorization. A series of experiments shows that the algorithm greatly raises efficiency while only slightly reducing categorization accuracy, thereby balancing the trade-off between the two.
  • Keywords
    pattern classification; pattern clustering; statistical analysis; text analysis; unsupervised learning; word processing; cluster idea; clustering-based text categorization algorithm; data excavation; self-initiated learning; text categorization pattern; text clustering; training set; Algorithm design and analysis; Clustering algorithms; Clustering methods; Computer science; Computer science education; Educational technology; Management training; Partitioning algorithms; Testing; Text categorization; K-Means algorithm; KNN algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Education Technology and Computer Science (ETCS), 2010 Second International Workshop on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-6388-6
  • Electronic_ISBN
    978-1-4244-6389-3
  • Type

    conf

  • DOI
    10.1109/ETCS.2010.413
  • Filename
    5458527
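
The abstract describes a clustering-based categorization algorithm, and the keywords name K-Means and KNN, but the record does not give the procedure itself. The following is a minimal sketch of one plausible reading, not the authors' method: each class's training vectors are replaced by a few K-Means centroids, and a query is then classified by KNN over those centroids. Interpreting "reducing the dimension of the training set" as shrinking the number of training examples is an assumption, as are the function names (`kmeans`, `condense`, `knn_predict`), the Euclidean distance, and the toy two-dimensional vectors standing in for document feature vectors.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain K-Means on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def condense(train, k_per_class):
    """Replace each class's vectors with at most k_per_class centroids."""
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    condensed = []
    for label, vecs in by_label.items():
        k = min(k_per_class, len(vecs))
        condensed += [(c, label) for c in kmeans(vecs, k)]
    return condensed


def knn_predict(condensed, query, k=3):
    """Majority vote among the k labelled centroids nearest to query."""
    nearest = sorted(condensed, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy data (hypothetical): two well-separated classes of four vectors each.
train = [
    ([0.0, 0.1], "a"), ([0.1, 0.0], "a"), ([0.2, 0.1], "a"), ([0.0, 0.2], "a"),
    ([5.0, 5.1], "b"), ([5.1, 5.0], "b"), ([5.2, 5.1], "b"), ([5.0, 5.2], "b"),
]
condensed = condense(train, k_per_class=2)
```

With this toy data, `condensed` holds four labelled centroids instead of eight training vectors, so `knn_predict(condensed, [0.1, 0.1], k=1)` compares the query against half as many items. This mirrors the trade-off the abstract reports: fewer comparisons per query at the cost of some accuracy, since centroids only approximate the original examples.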