• DocumentCode
    3062058
  • Title
    Detecting and recognizing numerical strings in Farsi document images
  • Author
    Abedi, Ali ; Faez, Karim ; Mozaffari, Saeed
  • Author_Institution
    Electr. Eng. Dept., Amirkabir Univ. of Technol., Tehran, Iran
  • fYear
    2009
  • fDate
    23-25 Nov. 2009
  • Firstpage
    403
  • Lastpage
    408
  • Abstract
    In this paper, we propose a new approach for detecting and recognizing numerical strings in Farsi/Arabic handwritten or machine-printed document images. Each connected component is labeled according to whether or not it belongs to a numerical string. First, to differentiate between digit and non-digit connected components, simple features are extracted from all connected components in each text line. These features are then classified with a fuzzy rule-based classifier to extract candidate strings. After a digit recognizer is applied, the syntax of the numerical strings is validated by a syntactic verifier. Experimental results show an acceptable detection rate with a low false positive rate.
  • Keywords
    document image processing; feature extraction; fuzzy set theory; image classification; object detection; string matching; Farsi document images; Farsi-Arabic handwritten; digit recognizer; fuzzy rule-based classifier; machine-printed document images; numerical string detecting; numerical string recognition; Character recognition; Computer vision; Costs; Data mining; Handwriting recognition; Image converters; Image recognition; Optical character recognition software; Text analysis; Farsi/Arabic document analysis; Information extraction; Numerical Strings
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image and Vision Computing New Zealand, 2009. IVCNZ '09. 24th International Conference
  • Conference_Location
    Wellington
  • ISSN
    2151-2205
  • Print_ISBN
    978-1-4244-4697-1
  • Electronic_ISBN
    2151-2205
  • Type
    conf
  • DOI
    10.1109/IVCNZ.2009.5378373
  • Filename
    5378373