DocumentCode
2273668
Title
Structural abstractions of hypertext documents for Web-based retrieval
Author
Deogun, Jitender S. ; Sever, Hayri ; Ragh, Vijay V.
Author_Institution
Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA
fYear
1998
fDate
25-28 Aug 1998
Firstpage
385
Lastpage
390
Abstract
There have been conflicting views in the literature on the capability of tools and mechanisms for storing and accessing information over the Internet. On the one hand, it has long been claimed that the World Wide Web offers a chaotic environment for Web agents to extract information, because the description of a document in HTML is easily comprehensible by humans but not by machines. On the other hand, it has been hypothesized that information is sufficiently structured to facilitate effective Web mining, especially for electronic catalogs. In this article we do not intend to take a position on this matter, but rather investigate the performance of a search engine while indexing more logical elements of HTML documents and while increasing the scope of the indexing process.
Keywords
Internet; hypermedia; indexing; information retrieval; HTML; Web-based retrieval; World Wide Web; electronic catalogs; hypertext documents; structural abstractions; Chaos; Data mining; Electronic catalog; Humans; Search engines; Web mining; Web sites
fLanguage
English
Publisher
ieee
Conference_Titel
Database and Expert Systems Applications, 1998. Proceedings. Ninth International Workshop on
Conference_Location
Vienna
Print_ISBN
0-8186-8353-8
Type
conf
DOI
10.1109/DEXA.1998.707429
Filename
707429