Title :
A Cache Language Model for Whole Document Handwriting Recognition
Author :
Frinken, Volkmar ; Karatzas, Dimosthenis ; Fischer, Anath
Author_Institution :
Fac. of Inf. Sci. & Electr. Eng., Kyushu Univ., Fukuoka, Japan
Abstract :
With increasing computational power, the trend in unconstrained text recognition is moving towards whole document processing. For this task, more sophisticated language models can be employed. One approach is to take advantage of the fact that the text of a document normally deals with a specific topic and hence the word occurrence probability is biased. Cache language models combine information about recent words, the cache, with a general statistical language model to increase the recognition rate. In this work we introduce a modified version of the cache language model to the task of handwriting recognition, where the N-best recognition output of the entire document is used to refine the language model for a subsequent recognition pass. An experimental evaluation on the IAM database demonstrates that the word error rate can be reduced with the proposed cache language model.
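As an illustration of the general idea described in the abstract, the following is a minimal sketch of a unigram cache language model with linear interpolation. It is not the authors' modified model: the function names `build_cache` and `interpolate`, the weighting of each hypothesis by a score, and the interpolation weight `lam` are all assumptions made for this example.

```python
from collections import Counter


def build_cache(nbest_lists):
    """Accumulate weighted word counts from a first recognition pass.

    nbest_lists: one N-best list per text line; each entry is a pair
    (words, weight), where the weight is an assumed hypothesis score.
    """
    cache = Counter()
    for nbest in nbest_lists:
        for words, weight in nbest:
            for word in words:
                cache[word] += weight
    return cache


def interpolate(word, base_prob, cache, lam=0.2):
    """Mix a general language model probability with the cache estimate.

    base_prob: probability of `word` under the general statistical model.
    lam: assumed interpolation weight given to the cache.
    """
    total = sum(cache.values())
    cache_prob = cache[word] / total if total else 0.0
    return (1.0 - lam) * base_prob + lam * cache_prob


# Hypothetical N-best output for a single text line.
nbest_lists = [[(["the", "cat"], 0.75), (["the", "cut"], 0.25)]]
cache = build_cache(nbest_lists)
adapted = interpolate("cat", base_prob=0.01, cache=cache, lam=0.2)
```

In this sketch, words that appear often in the first-pass hypotheses receive a higher probability in the second pass, which mirrors the topic bias the abstract refers to. With an empty cache the estimate falls back to the scaled general model probability.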
Keywords :
computational linguistics; document image processing; handwriting recognition; probability; IAM database; N-best recognition; cache language model; document handwriting recognition; statistical language model; text recognition; word error rate; word occurrence probability; Adaptation models; Computational modeling; Databases; Handwriting recognition; Hidden Markov models; Speech recognition; Text recognition; Handwriting Recognition; Language Model; Whole Document;
Conference_Titel :
Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
Conference_Location :
Tours
Print_ISBN :
978-1-4799-3243-6
DOI :
10.1109/DAS.2014.56