DocumentCode :
153323
Title :
A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents
Author :
Fischer, Andreas ; Baechler, Micheal ; Garz, Angelika ; Liwicki, Marcus ; Ingold, Rolf
Author_Institution :
Dept. of Electr. Eng., Polytech. Montreal, Montreal, QC, Canada
fYear :
2014
fDate :
7-10 April 2014
Firstpage :
71
Lastpage :
75
Abstract :
Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.
Keywords :
digital libraries; document image processing; feature extraction; handwriting recognition; Swiss research project HisDoc; ancient manuscript; automated reading; digital libraries; document analysis task; flexible learning-based method; handwriting recognition; historical document; layout analysis; medieval Parzival database; text line extraction; text localization; transcription; word recognition; Accuracy; Databases; Handwriting recognition; Hidden Markov models; Layout; Text analysis; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
Conference_Location :
Tours
Print_ISBN :
978-1-4799-3243-6
Type :
conf
DOI :
10.1109/DAS.2014.51
Filename :
6830972