DocumentCode :
3315604
Title :
Simultaneous word segmentation from document images using recursive morphological closing transform
Author :
Chen, Su ; Haralick, Robert M. ; Phillips, Ihsin T.
Author_Institution :
Dept. of Electr. Eng., Washington Univ., Seattle, WA, USA
Volume :
2
fYear :
1995
fDate :
14-16 Aug 1995
Firstpage :
761
Abstract :
This paper describes a word segmentation algorithm based on the recursive morphological closing transform. The algorithm is trainable for any given document image population and is capable of detecting words on a document image simultaneously. We describe an experimental protocol to train and evaluate our word segmentation algorithm on a set of layout ground-truthed document images. We also discuss a method to compare two sets of word bounding boxes, one from the ground truth and the other from the output of the word segmentation algorithm, and to compute the numbers of missed, false, correct, splitting, merging, and spurious detections. The experimental results demonstrate that, under the optimal algorithm parameter settings, the correct word detection percentage is about 95% on both the training and testing image populations. If splitting and merging detections are also counted, the detection percentage is about 99.4%.
Keywords :
document image processing; image segmentation; mathematical morphology; protocols; visual databases; document image population; document images; layout ground-truthed document images; optimal algorithm parameter settings; protocol; recursive morphological closing transform; simultaneous word segmentation; Computer science; Data mining; Image resolution; Image segmentation; Markov random fields; Pixel; Shape; White spaces;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-8186-7128-9
Type :
conf
DOI :
10.1109/ICDAR.1995.602014
Filename :
602014