DocumentCode :
3307186
Title :
Ground-truthing and benchmarking document page segmentation
Author :
Yanikoglu, Berrin A. ; Vincent, Luc
Author_Institution :
Xerox Imaging Systems, USA
Volume :
2
fYear :
1995
fDate :
14-16 Aug 1995
Firstpage :
601
Abstract :
We describe a new approach for evaluating page segmentation algorithms. Unlike techniques that rely on OCR output, our method is region-based: the segmentation output, described as a set of regions together with their types, output order, etc., is matched against the pre-stored set of ground-truth regions. Misclassifications, splitting, and merging of regions are among the errors that are detected by the system. Each error is weighted individually for a particular application and a global estimate of segmentation quality is derived. The system can be customized to benchmark specific aspects of segmentation (e.g., headline detection) and according to the type of error correction that might follow (e.g., re-typing). Segmentation ground-truth files are quickly and easily generated and edited using GroundsKeeper, an X-Window based tool that allows one to view a document, manually draw regions (arbitrary polygons) on it, and specify information about each region (e.g., type, parent).
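The region-based matching summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes axis-aligned rectangles instead of arbitrary polygons, a simple overlap threshold for deciding when two regions match, and hypothetical names throughout (Region, find_errors, weighted_cost, min_frac, and the example weights).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    kind: str    # region type, e.g. "text" or "headline"
    box: tuple   # (x0, y0, x1, y1); rectangles are a simplifying assumption


def area(r):
    return (r.box[2] - r.box[0]) * (r.box[3] - r.box[1])


def overlap_area(a, b):
    w = min(a.box[2], b.box[2]) - max(a.box[0], b.box[0])
    h = min(a.box[3], b.box[3]) - max(a.box[1], b.box[1])
    return max(w, 0) * max(h, 0)


def find_errors(truth, output, min_frac=0.1):
    """Return (error_type, region_name) pairs found by overlap matching."""
    def significant(a, b):
        # Two regions match if they share at least min_frac of the smaller one.
        ov = overlap_area(a, b)
        return ov > 0 and ov >= min_frac * min(area(a), area(b))

    errors = []
    for t in truth:
        hits = [o for o in output if significant(t, o)]
        if not hits:
            errors.append(("miss", t.name))
        elif len(hits) > 1:
            errors.append(("split", t.name))      # one true region, many outputs
        elif hits[0].kind != t.kind:
            errors.append(("misclassification", t.name))
    for o in output:
        hits = [t for t in truth if significant(o, t)]
        if len(hits) > 1:
            errors.append(("merge", o.name))      # one output covers many true regions
    return errors


def weighted_cost(errors, weights):
    """Sum application-specific weights; unlisted error types cost 1.0."""
    return sum(weights.get(kind, 1.0) for kind, _ in errors)


truth = [
    Region("headline", "headline", (0, 0, 100, 20)),
    Region("col1", "text", (0, 30, 45, 200)),
    Region("col2", "text", (55, 30, 100, 200)),
]
output = [
    Region("o1", "text", (0, 0, 100, 20)),    # headline labeled as plain text
    Region("o2", "text", (0, 30, 100, 200)),  # both columns merged into one
]
errors = find_errors(truth, output)
cost = weighted_cost(errors, {"merge": 5.0, "misclassification": 2.0})
```

Changing the weight table is what would tailor such a score to a given application: for example, a benchmark focused on headline detection could raise the weight of misclassification errors.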
Keywords :
document image processing; errors; graphical user interfaces; image classification; image matching; image segmentation; merging; optical character recognition; software performance evaluation; GroundsKeeper; OCR output; X-Window; arbitrary polygons; benchmarking; customization; document page segmentation; document view; ground-truthing; headline detection; image regions; region merging; region misclassifications; region splitting; region-based method; retyping; segmentation quality; Costs; Error analysis; Error correction; Graphics; Image segmentation; Labeling; Merging; Optical character recognition software; Postal services; Text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
Type :
conf
DOI :
10.1109/ICDAR.1995.601968
Filename :
601968