• DocumentCode
    3483389
  • Title
    An encoding technique based on word importance for the clustering of Web documents
  • Author
    Zakos, J.; Verma, Brijesh
  • Author_Institution
    Sch. of Inf. Technol., Griffith Univ., Australia
  • Volume
    5
  • fYear
    2002
  • fDate
    18-22 Nov. 2002
  • Firstpage
    2207
  • Abstract
    We present a word encoding and clustering technique that groups Web documents based on the importance of the words that appear in the documents. We use a two level self-organizing map architecture to generate clusters of words and documents. We propose that by capturing word importance information, similar documents can then be clustered to assist in Web document retrieval. A Web document retrieval system is presented to demonstrate how this approach could be integrated into Web search.
  • Keywords
    Internet; encoding; information retrieval; pattern clustering; search engines; self-organising feature maps; word processing; Web document clustering; Web document retrieval system; encoding technique; two level self-organizing map architecture; word encoding; word importance; word importance information; Encoding; Gold; Histograms; Information processing; Information retrieval; Information technology; Internet; Search engines; Self organizing feature maps; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
  • Print_ISBN
    981-04-7524-1
  • Type
    conf
  • DOI
    10.1109/ICONIP.2002.1201885
  • Filename
    1201885