• DocumentCode
    2227709
  • Title

    A text mining approach on automatic generation of Web directories and hierarchies

  • Author

    Yang, Hsin-Chang ; Lee, Chung-Hong

  • Author_Institution
    Dept. of Inf. Manage., Chang Jung Univ., Tainan, Taiwan
  • fYear
    2003
  • fDate
    13-17 Oct. 2003
  • Firstpage
    625
  • Lastpage
    628
  • Abstract
    There is an enormous number of Web pages in the world. Retrieving required information from the WWW is thus an arduous task. Different models for retrieving Web pages have been used by the WWW community. One of the most widely used models is traversing a predefined Web directory hierarchy to reach a user's goal. Web directories are compiled or classified folders of Web pages and are usually organized into a hierarchical structure. The classification of Web pages into proper directories and the organization of directory hierarchies are generally performed by human experts. We provide a method that applies a text mining technique to a set of Web pages to automatically create Web directories and organize them into hierarchies. The method is based on the self-organizing map learning algorithm and requires no human intervention during the construction of Web directories and hierarchies. The experiments show that our method can produce comprehensible and reasonable Web directories and hierarchies.
  • Keywords
    Web sites; data mining; information retrieval; learning (artificial intelligence); self-organising feature maps; Web directory; Web page; human expert; information retrieval; self-organizing map learning algorithm; text mining; Clustering algorithms; Humans; Information management; Information retrieval; Labeling; Neural networks; Neurons; Text mining; Web pages; World Wide Web;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2003. WI 2003. Proceedings. IEEE/WIC International Conference on
  • Print_ISBN
    0-7695-1932-6
  • Type

    conf

  • DOI
    10.1109/WI.2003.1241282
  • Filename
    1241282