DocumentCode
3483389
Title
An encoding technique based on word importance for the clustering of Web documents
Author
Zakos, J. ; Verma, Brijesh
Author_Institution
Sch. of Inf. Technol., Griffith Univ., Australia
Volume
5
fYear
2002
fDate
18-22 Nov. 2002
Firstpage
2207
Abstract
We present a word encoding and clustering technique that groups Web documents based on the importance of the words that appear in them. We use a two-level self-organizing map architecture to generate clusters of words and documents. We propose that, by capturing word importance information, similar documents can then be clustered to assist in Web document retrieval. A Web document retrieval system is presented to demonstrate how this approach could be integrated into Web search.
Keywords
Internet; encoding; information retrieval; pattern clustering; search engines; self-organising feature maps; word processing; Web document clustering; Web document retrieval system; encoding technique; two level self-organizing map architecture; word encoding; word importance; word importance information; Encoding; Gold; Histograms; Information processing; Information retrieval; Information technology; Internet; Search engines; Self organizing feature maps; Web pages
fLanguage
English
Publisher
ieee
Conference_Titel
Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
Print_ISBN
981-04-7524-1
Type
conf
DOI
10.1109/ICONIP.2002.1201885
Filename
1201885