DocumentCode :
2013343
Title :
An Overview of the Tesseract OCR Engine
Author :
Smith, Ray
Author_Institution :
Google Inc., Mountain View
Volume :
2
fYear :
2007
fDate :
23-26 Sept. 2007
Firstpage :
629
Lastpage :
633
Abstract :
The Tesseract OCR engine, as was the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy, is described in a comprehensive overview. Emphasis is placed on aspects that are novel or at least unusual in an OCR engine, including in particular the line finding, features/classification methods, and the adaptive classifier.
Keywords :
image classification; optical character recognition; Tesseract OCR engine; UNLV; adaptive classifier; line finding; Filters; Independent component analysis; Inspection; Open source software; Optical character recognition software; Pipelines; Prototypes; Search engines; Testing; Text recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location :
Parana
ISSN :
1520-5363
Print_ISBN :
978-0-7695-2822-9
Type :
conf
DOI :
10.1109/ICDAR.2007.4376991
Filename :
4376991