• DocumentCode
    3202230
  • Title

    The Capability Analysis on the Characteristic Selection Algorithm of Text Categorization Based on F1 Measure Value

  • Author

    He Shaojun ; Cao Jin ; Guo Ruixu ; Wang Guijun

  • Author_Institution
    Northern Electron. Instrum. Inst., Beijing, China
  • fYear
    2012
  • fDate
    8-10 Dec. 2012
  • Firstpage
    742
  • Lastpage
    746
  • Abstract
    Text categorization is an important part of natural language processing. It can be used to identify category information in natural-language text, which helps address information clutter and supports the directed detection and scouting of information. This paper presents the general process of text categorization. Using the Sogou datasets as the target, the performance of several typical characteristic selection algorithms is analyzed with a KNN classifier under different characteristic dimensions and classification methods, with the text categorization experiments evaluated by F1 measure value.
  • Keywords
    natural language processing; pattern classification; text analysis; F1 measure value; KNN classification machine; Sogou datasets; capability analysis; categorization information; characteristic dimensions; characteristic selection algorithm; classification methods; directional detection; text categorization; text categorization experiment; Algorithm design and analysis; Classification algorithms; Computational modeling; Data models; Internet; Support vector machines; Characteristic Selection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Instrumentation, Measurement, Computer, Communication and Control (IMCCC), 2012 Second International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-1-4673-5034-1
  • Type

    conf

  • DOI
    10.1109/IMCCC.2012.180
  • Filename
    6429015