DocumentCode
482165
Title
Preprocessing and Feature Preparation in Chinese Web Page Classification
Author
Huang, Weitong ; Xu, Luxiong ; Liu, Yanmin
Author_Institution
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing
Volume
1
fYear
2009
fDate
22-24 Jan. 2009
Firstpage
64
Lastpage
67
Abstract
This paper describes the detailed design and implementation of a Chinese Web-page classification system and proposes several methods for Chinese Web-page preprocessing and feature preparation. Experimental results on a Chinese Web-page dataset show that the proposed methods improve performance from 75.82% to 81.88%.
Keywords
Web sites; classification; natural language processing; Chinese Web page classification; Chinese Web-page dataset; Chinese Web-page preprocessing; feature preparation; Application software; Computer applications; Computer science; Data mining; Design engineering; HTML; Navigation; Particle separators; Vocabulary; Web pages; Text classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Engineering and Technology, 2009. ICCET '09. International Conference on
Conference_Location
Singapore
Print_ISBN
978-1-4244-3334-6
Type
conf
DOI
10.1109/ICCET.2009.72
Filename
4769428