DocumentCode :
2273668
Title :
Structural abstractions of hypertext documents for Web-based retrieval
Author :
Deogun, Jitender S. ; Sever, Hayri ; Raghavan, Vijay V.
Author_Institution :
Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA
fYear :
1998
fDate :
25-28 Aug 1998
Firstpage :
385
Lastpage :
390
Abstract :
There have been conflicting views in the literature on the capability of tools and mechanisms for storing and accessing information over the Internet. On the one hand, it has long been claimed that the World Wide Web offers a chaotic environment for Web agents to extract information, because the description of a document in HTML is easily comprehensible by humans but not by machines. On the other hand, it has been hypothesized that information is sufficiently structured to facilitate effective Web mining, especially for electronic catalogs. In this article we do not intend to take a position on this matter, but rather investigate the performance of a search engine while indexing more logical elements of HTML documents and while increasing the scope of the indexing process.
Keywords :
Internet; hypermedia; indexing; information retrieval; HTML; Web-based retrieval; World Wide Web; electronic catalogs; hypertext documents; structural abstractions; Chaos; Data mining; Electronic catalog; Humans; Search engines; Web mining; Web sites
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Database and Expert Systems Applications, 1998. Proceedings. Ninth International Workshop on
Conference_Location :
Vienna
Print_ISBN :
0-8186-8353-8
Type :
conf
DOI :
10.1109/DEXA.1998.707429
Filename :
707429