Title :
Web categorization using hybrid algorithms
Author :
Ye, Wei-guo ; Lu, Zheng-Ding
Author_Institution :
Dept. of Comput. Sci., Huazhong Univ. of Sci. & Technol., Wuhan, China
Abstract :
Obtaining information from the Web has become an important problem. Traditional text categorization algorithms are not sufficient for Web categorization. In this paper we discuss the Web categorization process and propose a new information gain measure for feature selection and term weighting. We also discuss three linear classifiers. We then propose a novel hyperlink-based classifier that uses the characteristics of the Web graph. Experimental comparisons of these algorithms show that our approach is better suited to Web categorization than traditional information retrieval methods.
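The abstract does not define the new information gain measure, so it cannot be reproduced here. As background only, the following is a minimal sketch of the standard (textbook) information gain score that is commonly used for term selection in text categorization. The function names, the set-of-terms document representation, and the toy corpus are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """Standard information gain of a binary term-presence feature.

    docs   : list of sets of terms (hypothetical representation)
    labels : list of class labels, one per document
    term   : the candidate feature
    """
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    total = len(labels)
    # Class entropy remaining after splitting on presence/absence of the term.
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Toy corpus, invented purely for illustration.
docs = [
    {"goal", "match"},
    {"match", "team"},
    {"stock", "market"},
    {"market", "bank"},
]
labels = ["sports", "sports", "finance", "finance"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)
```

In this toy corpus, terms that occur in every document of exactly one class ("match", "market") receive the maximum score of 1 bit, while terms that occur in a single document score lower. A feature selection step would typically keep the top-ranked terms; how the paper's measure differs from this baseline is not stated in the abstract.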
Keywords :
Web sites; classification; feature extraction; information retrieval; learning (artificial intelligence); Web categorization; Web graph; feature selection; hyperlink classifier; information gain measure; information retrieval; learning methods; term weighting; Artificial intelligence; Computer science; Information resources; Information retrieval; Machine learning; Resumes; Spatial databases; Text categorization; Web pages; Web sites;
Conference_Title :
Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on
Print_ISBN :
0-7803-7508-4
DOI :
10.1109/ICMLC.2002.1174529