Title :
A Web Page Classification Algorithm Based on Link Information
Author :
Xu, Zhaohui ; Yan, Fuliang ; Qin, Jie ; Zhu, Haifeng
Author_Institution :
Sch. of Comput., Wuhan Univ. of Technol., Wuhan, China
Abstract :
Effective classification of web pages can improve the quality of information retrieval. Traditional classification algorithms rely mainly on analyzing web page content, but that content is complicated and often contains a large amount of false or erroneous information, which seriously reduces the accuracy of network information classification. To address this problem, this paper presents a web page classification algorithm, Link Information Categorization (LIC). Based on the K nearest neighbor method, it combines website feature information to classify web pages by their link information. Experiments show that the algorithm achieves higher efficiency and accuracy in web page classification.
Keywords :
Web sites; information retrieval; pattern classification; K nearest neighbor method; Web content analysis; Web page classification algorithm; Web page link; Web site features; link information categorization; network information classification; Accuracy; Algorithm design and analysis; Classification algorithms; Internet; Support vector machines; Text categorization; Web pages; Link Information; Web Page Classification
Conference_Title :
2011 Tenth International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES)
Conference_Location :
Wuxi
Print_ISBN :
978-1-4577-0327-0
DOI :
10.1109/DCABES.2011.19