DocumentCode
183467
Title
ICFHR2014 Competition on Handwritten Text Recognition on Transcriptorium Datasets (HTRtS)
Author
Andreu Sanchez, Joan ; Romero, Veronica ; Toselli, Alejandro Hector ; Vidal, Enrique
Author_Institution
Pattern Recognition & Human Language Technol. Res. Center, Univ. Politec. de Valencia, Valencia, Spain
fYear
2014
fDate
1-4 Sept. 2014
Firstpage
785
Lastpage
790
Abstract
A contest on Handwritten Text Recognition organised in the context of the ICFHR 2014 conference is described. Two tracks with increased freedom on the use of training data were proposed and three research groups participated in these two tracks. The handwritten images for this contest were drawn from an English data set which is currently being considered in the tranScriptorium project. The goal of this project is to develop innovative, efficient and cost-effective solutions for the transcription of historical handwritten document images, focusing on four languages: English, Spanish, German and Dutch. For the English language, the so-called "Bentham collection" is being considered in tranScriptorium. It encompasses a large set of manuscripts written by the renowned English philosopher and reformer Jeremy Bentham (1748-1832). A small subset of this collection has been chosen for the present HTR competition. The selected subset has been written by several hands (Bentham himself and his secretaries) and entails significant variabilities and difficulties regarding the quality of text images and writing styles. Training and test data were provided in the form of carefully segmented line images, along with the corresponding transcripts. The three participants achieved very good results, with transcription word error rates ranging from 15.0% down to 8.6%.
Keywords
document image processing; handwritten character recognition; image segmentation; natural language processing; text analysis; Bentham collection; Dutch language; English language; German language; HTR competition; HTRtS; ICFHR2014 competition; Spanish language; handwritten text recognition; historical handwritten document image transcription; segmented line images; tranScriptorium datasets; Adaptive optics; Artificial neural networks; Hidden Markov models; Histograms; Text recognition; Training; Training data; Handwritten Text Recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location
Heraklion
ISSN
2167-6445
Print_ISBN
978-1-4799-4335-7
Type
conf
DOI
10.1109/ICFHR.2014.137
Filename
6981116