DocumentCode
3088172
Title
Cost-benefit analysis of Web bag in a Web warehouse
Author
Bhowmick, Sourav S. ; Madria, Sanjay ; Ng, Wee-Keong ; Lim, Ee-Peng
Author_Institution
Centre for Adv. Inf. Syst., Nanyang Technol. Univ., Singapore
fYear
1999
fDate
Aug 1999
Firstpage
34
Lastpage
42
Abstract
Sets and bags are closely related structures and have been studied in relational databases. A bag differs from a set in that it is sensitive to the number of times an element occurs, while a set is not. In this paper, we introduce the concept of a Web bag in the context of a World Wide Web warehouse called WHOWEDA (WareHouse Of WEb DAta), which we are currently building. Informally, a Web bag is a Web table that allows multiple occurrences of identical Web types. A Web bag helps one discover useful knowledge from a Web table, such as visible documents or Web sites (i.e., documents or sites that can be reached by many paths), luminous documents (i.e., documents with many outgoing links), and luminous paths (i.e., frequently traversed paths). We provide a cost-benefit analysis of materializing Web bags as compared to Web tables with distinct Web tuples.
Keywords
cost-benefit analysis; data mining; data structures; data warehouses; information resources; search engines; WHOWEDA; Web bags; Web tables; Web tuples; World Wide Web warehouse; element occurrence; fan-in; fan-out; frequently traversed paths; identical Web types; luminous documents; luminous paths; outgoing links; useful knowledge discovery; visible Web sites; visible documents; Computer science; Cost benefit analysis; Current measurement; Educational technology; Hard disks; Information systems; Read only memory; Search engines; World Wide Web
fLanguage
English
Publisher
IEEE
Conference_Titel
Database Engineering and Applications, 1999. IDEAS '99. International Symposium Proceedings
Conference_Location
Montreal, Que.
Print_ISBN
0-7695-0265-2
Type
conf
DOI
10.1109/IDEAS.1999.787249
Filename
787249