DocumentCode
2011334
Title
Towards Semi-supervised Transcription of Handwritten Historical Weather Reports
Author
Richarz, Jan ; Vajda, Szilárd ; Fink, Gernot A.
Author_Institution
Dept. of Comput. Sci., Tech. Univ. Dortmund, Dortmund, Germany
fYear
2012
fDate
27-29 March 2012
Firstpage
180
Lastpage
184
Abstract
This paper addresses the automatic transcription of handwritten documents with a regular tabular structure. A method for extracting machine-printed tables from images is proposed, using very little prior knowledge about the document layout. The detected table serves as a query for retrieving and fitting a structural template, which is then used to extract handwritten text fields. A semi-supervised learning approach is applied to these fields, aiming at minimizing the human labeling effort for recognizer training. The effectiveness of the proposed approach is demonstrated experimentally on a set of historical weather reports. Compared to using all labels, competitive recognition performance is achieved by labeling only a small fraction of the data, keeping the required human effort very low.
Keywords
feature extraction; geophysics computing; handwritten character recognition; history; image retrieval; learning (artificial intelligence); text analysis; text detection; automatic handwritten document transcription; document layout; handwritten historical weather reports; handwritten text field extraction; human labeling effort minimisation; machine printed table extraction method; query processing; regular tabular structure; semisupervised learning; semisupervised transcription; structural template fitting; structural template retrieval; training recognizer; Handwriting recognition; Humans; Labeling; Meteorology; Principal component analysis; Text analysis; Training; document analysis; handwriting recognition; historical documents; layout analysis; semi-supervised learning;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
Conference_Location
Gold Coast, QLD
Print_ISBN
978-1-4673-0868-7
Type
conf
DOI
10.1109/DAS.2012.91
Filename
6195359