DocumentCode :
2061134
Title :
Document image similarity and equivalence detection
Author :
Hull, Jonathan J. ; Cullen, John F.
Author_Institution :
Ricoh California Res. Center, Menlo Park, CA, USA
Volume :
1
fYear :
1997
fDate :
18-20 Aug 1997
Firstpage :
308
Abstract :
A hierarchical algorithm is presented for determining the similarity and equivalence of document images. Features extracted from the CCITT fax-compressed representations of two images are compared to determine their visual similarity and whether they are equivalent. Pass codes in the compressed data are used as features. A fixed grid is imposed on the image, and a feature vector is derived from the number of pass codes in each grid cell. The feature vectors are compared to locate a group of documents that are visually similar to the input image. The equivalence of two documents is determined by applying the Hausdorff distance to the two-dimensional arrangement of pass codes in small patches of each image.
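The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the pass-code positions have already been decoded from the compressed stream into non-empty lists of integer (x, y) pixel coordinates; the decoding step is not shown, and the function names, grid size, and coordinate convention are hypothetical.

```python
from math import hypot


def grid_feature_vector(points, width, height, rows, cols):
    """Count points per cell of a fixed rows x cols grid (row-major order).

    `points` stands in for decoded pass-code positions (an assumption).
    """
    counts = [0] * (rows * cols)
    for x, y in points:
        # Map pixel coordinates to a grid cell, clamping to the last cell.
        r = min(y * rows // height, rows - 1)
        c = min(x * cols // width, cols - 1)
        counts[r * cols + c] += 1
    return counts


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

In this sketch, the grid counts would serve the coarse similarity search, while the Hausdorff distance would be applied to point sets taken from small patches of two candidate images; a distance near zero suggests the patches have the same arrangement of points. How the feature vectors are compared and which threshold signals equivalence are not stated in the abstract and are left open here.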
Keywords :
document image processing; facsimile; feature extraction; image coding; image representation; telecommunication standards; CCITT fax compressed representations; Hausdorff distance; compressed data; document image similarity; document images; equivalence detection; feature extraction; feature vector; fixed grid; grid cell; hierarchical algorithm; pass codes; small patches; two dimensional arrangement; visual similarity; Business; Data mining; Feature extraction; Grid computing; Image analysis; Image coding; Image databases; Spatial databases; Text analysis; Visual databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
Type :
conf
DOI :
10.1109/ICDAR.1997.619862
Filename :
619862