DocumentCode :
2011334
Title :
Towards Semi-supervised Transcription of Handwritten Historical Weather Reports
Author :
Richarz, Jan ; Vajda, Szilárd ; Fink, Gernot A.
Author_Institution :
Dept. of Comput. Sci., Tech. Univ. Dortmund, Dortmund, Germany
fYear :
2012
fDate :
27-29 March 2012
Firstpage :
180
Lastpage :
184
Abstract :
This paper addresses the automatic transcription of handwritten documents with a regular tabular structure. A method for extracting machine-printed tables from images is proposed, using very little prior knowledge about the document layout. The detected table serves as a query for retrieving and fitting a structural template, which is then used to extract handwritten text fields. A semi-supervised learning approach is applied to these fields, aiming at minimizing the human labeling effort for recognizer training. The effectiveness of the proposed approach is demonstrated experimentally on a set of historical weather reports. Compared to using all labels, competitive recognition performance is achieved by labeling only a small fraction of the data, keeping the required human effort very low.
Keywords :
feature extraction; geophysics computing; handwritten character recognition; history; image retrieval; learning (artificial intelligence); text analysis; text detection; automatic handwritten document transcription; document layout; handwritten historical weather reports; handwritten text field extraction; human labeling effort minimisation; machine printed table extraction method; query processing; regular tabular structure; semisupervised learning; semisupervised transcription; structural template fitting; structural template retrieval; training recognizer; Handwriting recognition; Humans; Labeling; Meteorology; Principal component analysis; Text analysis; Training; document analysis; handwriting recognition; historical documents; layout analysis; semi-supervised learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
Conference_Location :
Gold Coast, QLD
Print_ISBN :
978-1-4673-0868-7
Type :
conf
DOI :
10.1109/DAS.2012.91
Filename :
6195359