DocumentCode :
402856
Title :
A Web document clustering algorithm based on concept of neighbor
Author :
Song, Jiang-Chun ; Shen, Jun-Yi
Author_Institution :
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., China
Volume :
1
fYear :
2003
fDate :
2-5 Nov. 2003
Firstpage :
46
Abstract :
With its rapid development, the WWW has gradually become the most important resource for transferring and sharing global information, and it holds great latent potential. In recent years, Web mining research has attracted broad attention and produced many results. The nearest neighbor technique, a distance-based hierarchical clustering method, has been widely applied because of its efficiency and validity. In this paper, based on the vector space model (VSM) of Web documents, we improve the nearest neighbor method, propose a new Web document clustering algorithm, and study the algorithm's validity and scalability as well as its time and space complexity.
Keywords :
Web sites; computational complexity; data mining; information retrieval systems; unsupervised learning; Web document clustering algorithm; Web mining; World Wide Web; global information; nearest neighbor method; space complexity; time complexity; vector space model; Clustering algorithms; Clustering methods; Computer science; Nearest neighbor searches; Pattern analysis; Scalability
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
Type :
conf
DOI :
10.1109/ICMLC.2003.1264440
Filename :
1264440