DocumentCode
3062058
Title
Detecting and recognizing numerical strings in Farsi document images
Author
Abedi, Ali ; Faez, Karim ; Mozaffari, Saeed
Author_Institution
Electr. Eng. Dept., Amirkabir Univ. of Technol., Tehran, Iran
fYear
2009
fDate
23-25 Nov. 2009
Firstpage
403
Lastpage
408
Abstract
In this paper, we propose a new approach for detecting and recognizing numerical strings in Farsi/Arabic handwritten or machine-printed document images. We assign each connected component a label indicating whether or not it belongs to a numerical string. First, to differentiate between digit and non-digit connected components, some simple features are extracted from all connected components in each text line. Then, these features are classified with a fuzzy rule-based classifier to extract candidate strings. After a digit recognizer is applied, the syntax of the numerical strings is validated by a syntactic verifier. Experimental results show an acceptable detection rate with a low false positive rate.
Keywords
document image processing; feature extraction; fuzzy set theory; image classification; object detection; string matching; Farsi document images; Farsi-Arabic handwritten; digit recognizer; fuzzy rule-based classifier; machine-printed document images; numerical string detecting; numerical string recognition; Character recognition; Computer vision; Costs; Data mining; Handwriting recognition; Image converters; Image recognition; Optical character recognition software; Text analysis; Farsi/Arabic document analysis; Information extraction; Numerical Strings
fLanguage
English
Publisher
IEEE
Conference_Titel
Image and Vision Computing New Zealand, 2009. IVCNZ '09. 24th International Conference
Conference_Location
Wellington
ISSN
2151-2205
Print_ISBN
978-1-4244-4697-1
Electronic_ISBN
2151-2205
Type
conf
DOI
10.1109/IVCNZ.2009.5378373
Filename
5378373