Title :
Adaptive segmentation of document images
Author :
Sylwester, Don ; Seth, Sharad
Author_Institution :
Concordia Univ., NE, USA
fDate :
2001
Abstract :
A single-parameter text-line extraction algorithm is described, along with an efficient technique for estimating the optimal value of the parameter for each image without the need for ground truth. The algorithm is based on three simple tree operations: cut, glue, and flip. An XY-tree representing the segmentation is incrementally transformed to reflect a change in the parameter. Intrinsic measures of the cost of each transformation are used to detect when a specific tree operation would cause an error, so that such errors can be avoided. The algorithm correctly identified 98.8% of the area of the ground-truth bounding boxes and committed no column-bridging errors on a set of 97 test images selected from a variety of technical journals.
Keywords :
document image processing; feature extraction; image segmentation; XY-tree; adaptive algorithm; document image analysis; performance; segmentation; text-line extraction; tree modifications; Change detection algorithms; Costs; Error correction; Image analysis; Image segmentation; Merging; Performance evaluation; Pixel; Testing; Text analysis
Conference_Title :
Proceedings of the Sixth International Conference on Document Analysis and Recognition, 2001
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
DOI :
10.1109/ICDAR.2001.953903