Title :
Layout based document image retrieval by means of XY tree reduction
Author :
Marinai, Simone ; Marino, Emanuele ; Soda, Giovanni
Author_Institution :
DSI, Florence Univ., Italy
Date :
29 Aug.-1 Sept. 2005
Abstract :
We analyze a system for the retrieval of document images on the basis of layout similarity. Layout objects are extracted and represented with the XY tree. Page similarity is computed with a tree-edit distance algorithm. The peculiarity of the approach is the use of tree grammars to model the variations in the tree, which are due to segmentation algorithms or to structural differences between documents with similar layout. A few class-independent grammatical rules are used to modify each tree and obtain a reduced tree that is supposed to preserve the most relevant features of the page.
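The pipeline in the abstract (build a layout tree, reduce it, compare trees by edit distance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `Node` class, the labels `"H"`, `"text"` and `"image"`, the single reduction rule in `reduce_tree` (merge adjacent sibling leaves with the same label), and the use of a simple top-down (Selkow-style) edit distance with unit costs in place of whichever tree-edit distance algorithm the authors use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical layout-tree node: a cut ("H"/"V") or a leaf region."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(t: Node) -> int:
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)


def reduce_tree(t: Node) -> Node:
    """Illustrative reduction rule (an assumption, not the paper's grammar):
    merge runs of adjacent sibling leaves that share a label."""
    merged: List[Node] = []
    for c in (reduce_tree(c) for c in t.children):
        prev = merged[-1] if merged else None
        if prev and not prev.children and not c.children and prev.label == c.label:
            continue  # drop the repeated leaf
        merged.append(c)
    return Node(t.label, merged)


def distance(a: Node, b: Node) -> int:
    """Top-down ordered tree edit distance with unit costs: relabel the
    root, and insert or delete whole child subtrees."""
    cost = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),   # delete subtree
                d[i][j - 1] + size(b.children[j - 1]),   # insert subtree
                d[i - 1][j - 1] + distance(a.children[i - 1], b.children[j - 1]),
            )
    return cost + d[m][n]


# Two pages that differ only in how a text block was segmented.
page_a = Node("H", [Node("text"), Node("text"), Node("image")])
page_b = Node("H", [Node("text"), Node("image")])

raw = distance(page_a, page_b)
reduced = distance(reduce_tree(page_a), reduce_tree(page_b))
```

In this toy case the over-segmented text block costs one edit on the raw trees and none after reduction, which is the kind of segmentation-induced variation the abstract says the reduction is meant to absorb.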
Keywords :
digital libraries; document image processing; grammars; image retrieval; trees (mathematics); XY tree reduction; class-independent grammatical rules; layout based document image retrieval; layout similarity; page similarity; tree grammars; tree-edit distance algorithm; Classification tree analysis; Grid computing; Image analysis; Image segmentation; Indexing; Information retrieval; Performance evaluation; Software libraries; Testing
Conference_Title :
Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
Print_ISBN :
0-7695-2420-6
DOI :
10.1109/ICDAR.2005.150