• DocumentCode
    2197080
  • Title
    A Novel Arabic Baseline Estimation Algorithm Based on Sub-Words Treatment
  • Author
    Boukerma, Hanene; Farah, Nadir
  • Author_Institution
    Lab. de Gestion Electron. du Document (LABGED), Univ. 20 Aout 1955, Skikda, Algeria
  • fYear
    2010
  • fDate
    16-18 Nov. 2010
  • Firstpage
    335
  • Lastpage
    338
  • Abstract
    Baseline detection is an essential preprocessing step for many OCR systems. It directly affects the efficiency and reliability of the character segmentation and feature extraction stages, which contribute strongly to higher recognition accuracy. For Arabic handwriting, conventional methods that extract the baseline as a single straight line are ill-suited, because an Arabic word may be composed of two or more sub-words (PAWs), and the arrangement of these sub-words can produce different slant angles within the same word. Focusing on the source of the problem, we propose a novel Arabic baseline estimation algorithm in which the PAW, rather than the word, is the basic block to be processed. Experimental results on the IFN/ENIT database [1] demonstrate the efficiency of the proposed algorithm.
  • Keywords
    edge detection; feature extraction; handwritten character recognition; image segmentation; natural languages; optical character recognition; word processing; Arabic handwritten character recognition; OCR system; PAW level; Arabic baseline estimation algorithm; baseline detection; character segmentation reliability; subword treatment; Arabic handwritten; preprocessing; sub-word extraction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2010 International Conference on Frontiers in Handwriting Recognition (ICFHR)
  • Conference_Location
    Kolkata
  • Print_ISBN
    978-1-4244-8353-2
  • Type
    conf
  • DOI
    10.1109/ICFHR.2010.58
  • Filename
    5693545