DocumentCode
2148156
Title
Greek Polytonic OCR Based on Efficient Character Class Number Reduction
Author
Gatos, B. ; Louloudis, G. ; Stamatopoulos, N.
Author_Institution
Comput. Intell. Lab., Nat. Res. Center Demokritos, Athens, Greece
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
1155
Lastpage
1159
Abstract
Recognition of document images containing Greek polytonic (multi-accent) characters is a challenging task due to the large number of character classes (more than 270). In this paper, we propose a novel OCR framework for the recognition of machine-printed Greek polytonic documents that combines five different recognition modules so that each module handles only a small number of classes (around 30). One recognition module is used for accent recognition, while four recognition modules are used for the recognition of characters belonging to different horizontal text zones. The proposed system also includes the following stages: (a) pre-processing, (b) text dewarping, text line and text baseline detection, (c) accent and character detection, and (d) combination of accent and character recognition results. Extensive experiments have been conducted to measure the performance of the proposed OCR system, of all the recognition modules involved, and of the accent detection stage.
Keywords
document image processing; optical character recognition; text analysis; Greek polytonic OCR system; character class number reduction; character detection; character recognition; horizontal text zone; machine printed Greek polytonic document image recognition module; text baseline detection; text dewarping; Accuracy; Character recognition; Image segmentation; Measurement; Optical character recognition software; Text recognition; Class number reduction; Greek polytonic characters; OCR; Word baseline detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition (ICDAR), 2011 International Conference on
Conference_Location
Beijing
ISSN
1520-5363
Print_ISBN
978-1-4577-1350-7
Electronic_ISBN
1520-5363
Type
conf
DOI
10.1109/ICDAR.2011.233
Filename
6065491