DocumentCode
2771272
Title
Performance of Document Image OCR Systems for Recognizing Video Texts on Embedded Platform
Author
Chattopadhyay, Tanushyam ; Sinha, Priyanka ; Biswas, Provat
Author_Institution
Innovation Labs, Tata Consultancy Services Ltd., Kolkata, India
fYear
2011
fDate
7-9 Oct. 2011
Firstpage
606
Lastpage
610
Abstract
Market demand for an embedded implementation of video OCR motivated the authors to evaluate how well existing document image OCR techniques perform on this task. To that end, the authors ported open source OCR systems such as GOCR and Tesseract to an embedded platform. Their performance on that platform shows that character-level and word-level recognition accuracy is unacceptably low for video text. This paper compares two such open source OCR systems on Indian TV videos and proposes techniques that can improve recognition accuracy from 62% to 93%. The challenges of porting the code to an embedded platform are also analyzed.
Keywords
embedded systems; optical character recognition; text analysis; video signal processing; GOCR; Indian TV video; Tesseract; character level recognition; document image OCR system; document image OCR technique; embedded platform; open source OCR system; video OCR; video text recognition; word level recognition; Accuracy; Character recognition; Engines; Optical character recognition software; Streaming media; Text recognition; ABBYY; FineReader; GOCR; OCR; Tesseract; video;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Communication Networks (CICN), 2011 International Conference on
Conference_Location
Gwalior
Print_ISBN
978-1-4577-2033-8
Type
conf
DOI
10.1109/CICN.2011.131
Filename
6112941