• DocumentCode
    482165
  • Title
    Preprocessing and Feature Preparation in Chinese Web Page Classification
  • Author
    Huang, Weitong ; Xu, Luxiong ; Liu, Yanmin
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
  • Volume
    1
  • fYear
    2009
  • fDate
    22-24 Jan. 2009
  • Firstpage
    64
  • Lastpage
    67
  • Abstract
    This paper describes the detailed design and implementation of a Chinese Web-page classification system and proposes several methods for Chinese Web-page preprocessing and feature preparation. Experimental results on a Chinese Web-page dataset show that the proposed methods improve performance from 75.82% to 81.88%.
  • Keywords
    Web sites; classification; natural language processing; Chinese Web page classification; Chinese Web-page dataset; Chinese Web-page preprocessing; feature preparation; Application software; Computer applications; Computer science; Data mining; Design engineering; HTML; Navigation; Particle separators; Vocabulary; Web pages; Chinese web-page preprocessing; Feature preparation; Text classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Engineering and Technology, 2009. ICCET '09. International Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    978-1-4244-3334-6
  • Type
    conf
  • DOI
    10.1109/ICCET.2009.72
  • Filename
    4769428