DocumentCode :
183426
Title :
Automatic Line Segmentation and Ground-Truth Alignment of Handwritten Documents
Author :
Bluche, Theodore ; Moysset, Bastien ; Kermorvant, Christopher
Author_Institution :
A2iA SA, Paris, France
fYear :
2014
fDate :
1-4 Sept. 2014
Firstpage :
667
Lastpage :
672
Abstract :
In this paper, we present a method for the automatic segmentation and transcript alignment of documents for which we only have the transcript at the document level. We consider several line segmentation hypotheses, and recognition hypotheses for each segmented line. The recognition is highly constrained by the document transcript. We formalize the problem in a weighted finite-state transducer framework. We evaluate how the constraints help achieve a reasonable result. In particular, we assess the performance of the system in terms of both segmentation quality and transcript mapping. The main contribution of this paper is that we jointly find the best segmentation and transcript mapping, which allows the image to be aligned with the whole ground-truth text. The evaluation is carried out on fully annotated public databases. Furthermore, we retrieved training material with this system for the Maurdor evaluation, where the data was only annotated at the paragraph level. With the automatically segmented and annotated lines, we record a relative improvement in Word Error Rate of 35.6%.
Keywords :
document image processing; finite state machines; handwriting recognition; handwritten character recognition; image segmentation; Maurdor evaluation; automatic line segmentation; ground-truth alignment; handwritten document; segmentation quality; transcript alignment; transcript mapping; weighted finite-state transducer framework; word error rate; Databases; Feature extraction; Hidden Markov models; Image segmentation; Lattices; Optical character recognition software; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location :
Heraklion
ISSN :
2167-6445
Print_ISBN :
978-1-4799-4335-7
Type :
conf
DOI :
10.1109/ICFHR.2014.117
Filename :
6981096