• DocumentCode
    2273668
  • Title

    Structural abstractions of hypertext documents for Web-based retrieval

  • Author

    Deogun, Jitender S. ; Sever, Hayri ; Ragh, Vijay V.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA
  • fYear
    1998
  • fDate
    25-28 Aug 1998
  • Firstpage
    385
  • Lastpage
    390
  • Abstract
    There have been conflicting views in the literature on the capability of tools and mechanisms for storing and accessing information over the Internet. On the one hand, it has long been claimed that the World Wide Web offers a chaotic environment for Web agents to extract information, because the HTML description of a document is easily comprehensible by humans but not by machines. On the other hand, it has been hypothesized that information is sufficiently structured to facilitate effective Web mining, especially for electronic catalogs. In this article we do not intend to take a position on this matter, but rather investigate the performance of a search engine while indexing more logical elements of HTML documents and while increasing the scope of the indexing process.
  • Keywords
    Internet; hypermedia; indexing; information retrieval; HTML; Web-based retrieval; World Wide Web; electronic catalogs; hypertext documents; structural abstractions; Chaos; Data mining; Humans; Search engines; Web mining; Web sites
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Database and Expert Systems Applications, 1998. Proceedings. Ninth International Workshop on
  • Conference_Location
    Vienna
  • Print_ISBN
    0-8186-8353-8
  • Type

    conf

  • DOI
    10.1109/DEXA.1998.707429
  • Filename
    707429