• DocumentCode
    2528580
  • Title
    Developing discrete density Hidden Markov Models for Arabic printed text recognition
  • Author
    Awaida, S.M. ; Khorsheed, M.S.
  • Author_Institution
    Comput. Eng. Dept., Qassim Univ., Saudi Arabia
  • fYear
    2012
  • fDate
    12-14 July 2012
  • Firstpage
    35
  • Lastpage
    39
  • Abstract
    In this paper, a technique for the recognition of unconstrained Arabic printed text is proposed. Features that measure the image characteristics at local scales are applied. A line image is divided into a set of one-pixel-wide windows that slide across the text line. Run-length encoding is used to extract features from each window. A unique method is used to select the best number of transitions for each window. The proposed recognition system is trained and tested on the APTI (Arabic Printed Text Image) database. In order to select the optimal parameters for feature extraction and for the HMM classifier, the APTI training dataset is further divided into a smaller training subset and a verification set. The estimated parameters are then used in the testing phase. The presented technique provides state-of-the-art recognition results on the APTI database using HMMs. The achieved average recognition rate is 96.65% at the letter level using the HMM classifier.
  • Keywords
    document image processing; feature extraction; handwritten character recognition; hidden Markov models; image classification; learning (artificial intelligence); parameter estimation; text analysis; visual databases; APTI database; APTI training dataset; Arabic printed text image database; Arabic printed text recognition; HMM classifier; discrete density hidden Markov model; feature extraction; image characteristic; line image; one-pixel width window; parameter estimation; run length encoding; Databases; Feature extraction; Hidden Markov models; Testing; Text recognition; Training; Vectors; Arabic Optical Printed Text Recognition; Arabic Printed Text; Hidden Markov Model; OCR; Writer Independent Feature Extraction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Cybernetics (CyberneticsCom), 2012 IEEE International Conference on
  • Conference_Location
    Bali
  • Print_ISBN
    978-1-4673-0891-5
  • Type
    conf
  • DOI
    10.1109/CyberneticsCom.2012.6381612
  • Filename
    6381612
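
The feature extraction outlined in the abstract (one-pixel-wide windows slid across a line image, with run-length encoding applied to each window) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the image is assumed to be binarized (0 = background, 1 = ink), the window moves left to right, and the hypothetical parameter `num_runs` is a fixed cap on how many run lengths are kept per window, standing in for the paper's method of selecting the number of transitions, which the abstract does not describe. All function names are made up for this example.

```python
from itertools import groupby


def column_run_lengths(column):
    """Run-length encode one pixel column as (value, length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def column_features(column, num_runs):
    """Fixed-length vector of run lengths for a one-pixel-wide window.

    Assumption: the vector always starts with a background run, so a
    column that begins with ink gets a leading background run of length 0.
    `num_runs` is a hypothetical fixed cap, not the paper's selection rule.
    """
    runs = column_run_lengths(column)
    lengths = [length for _, length in runs]
    if runs and runs[0][0] == 1:
        lengths.insert(0, 0)
    lengths = lengths[:num_runs]
    # Zero-pad so every window yields a vector of the same size.
    return lengths + [0] * (num_runs - len(lengths))


def line_features(image, num_runs):
    """Slide a one-pixel-wide window across a line image.

    `image` is a list of rows, each a list of 0/1 pixels; zip(*image)
    yields its columns from left to right (an assumed direction).
    """
    return [column_features(list(col), num_runs) for col in zip(*image)]
```

Each resulting vector describes one window position. For a discrete density HMM, such vectors would still need to be mapped to symbols from a finite codebook, a step this sketch leaves out.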