DocumentCode :
2324188
Title :
BlockWeb: An IR Model for Block Structured Web Pages
Author :
Bruno, Emmanuel ; Faessel, Nicolas ; Le Maitre, J. ; Scholl, Michel
Author_Institution :
LSIS, Univ. du Sud Toulon-Var, La Garde
fYear :
2009
fDate :
3-5 June 2009
Firstpage :
219
Lastpage :
224
Abstract :
BlockWeb is a model that we have developed for indexing and querying web pages according to their content as well as their visual rendering. These pages are split up into blocks, which has several advantages in terms of page indexing and querying: (i) the blocks of a page most similar to a query may be returned instead of the page as a whole, (ii) the importance of a block can be taken into account, as can (iii) the permeability of the blocks to the content of neighboring blocks. In this paper, we present the BlockWeb model and show its value for indexing images of Web pages, through an experiment performed on electronic versions of French daily newspapers. We also present the engine we have implemented for block extraction, indexing and querying according to the BlockWeb model.
Keywords :
Web sites; indexing; query processing; rendering (computer graphics); BlockWeb; IR model; Web page indexing; Web page querying; block structured Web pages; visual rendering; Data mining; Data models; Engines; Indexing; Large scale integration; Permeability; Rendering (computer graphics); Vocabulary; Web pages; XML; block decomposition; image indexing; propagation; web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Content-Based Multimedia Indexing, 2009. CBMI '09. Seventh International Workshop on
Conference_Location :
Chania
Print_ISBN :
978-1-4244-4265-2
Electronic_ISBN :
978-0-7695-3662-0
Type :
conf
DOI :
10.1109/CBMI.2009.36
Filename :
5137844