Title :
A segmentation-free approach for printed Devanagari script recognition
Author :
Tushar Karayil;Adnan Ul-Hasan;Thomas M. Breuel
Author_Institution :
Department of Computer Science, University of Kaiserslautern, Germany
Abstract :
Long Short-Term Memory (LSTM) networks are a suitable candidate for segmentation-free Optical Character Recognition (OCR) tasks because of their context-aware processing. In this paper, we report the results of applying LSTM networks to Devanagari script, where consonant-consonant conjuncts and consonant-vowel combinations take different forms depending on their position in the word. We also introduce a new, freely available database of Devanagari script, Deva-DB, to aid research towards a robust Devanagari OCR system. On this database, the LSTM-based OCRopus system yields error rates ranging from 1.2% to 9.0%, depending on the complexity of the training and test data. A comparison with the open-source Tesseract system on the same database is also presented.
Keywords :
"Optical imaging","Robustness","Proteins","Periodic structures","Computer aided software engineering","Adaptive optics","Integrated optics"
Conference_Title :
2015 13th International Conference on Document Analysis and Recognition (ICDAR)
DOI :
10.1109/ICDAR.2015.7333901