• DocumentCode
    1781786
  • Title

    Efficient segmentation of sub-words within handwritten Arabic words

  • Author

    Khan, Faraz ; Bouridane, Ahmed ; Khelifi, Fouad ; Almotaeryi, Rasheed ; Almaadeed, Sumaya

  • Author_Institution
    Sch. of Comput., Eng. & Inf. Sci., Northumbria Univ., Newcastle upon Tyne, UK
  • fYear
    2014
  • fDate
    3-5 Nov. 2014
  • Firstpage
    684
  • Lastpage
    689
  • Abstract
    Segmentation is a core step in any recognition or classification method: for the text in a document to be recognized effectively, it must first be segmented accurately. This paper presents a text- and writer-independent algorithm for segmenting sub-words in Arabic words. The approach is based on global binarization of an image at multiple threshold levels. When each sub-word, or Part of Arabic Word (PAW), in the image under investigation is processed at multiple threshold levels, a cluster graph is obtained in which each cluster represents an individual sub-word of that word. Once the clusters are obtained, segmentation reduces to automatically selecting the respective cluster, which is done using the 95% confidence interval on the processed data generated by the accumulated graph. The algorithm was tested on 537 randomly selected words from the AHTID/MW database, and 95.3% of the sub-words (PAWs) were correctly segmented and extracted. The proposed method shows considerable improvement over the projection profile method, which is commonly used to segment sub-words (PAWs).
  • Keywords
    handwritten character recognition; image segmentation; text analysis; word processing; AHTID/MW database; PAW; Part of Arabic Word; cluster graph; document; global binarization; handwritten Arabic words; image thresholding; sub-word segmentation; text classification; text recognition; writer independent algorithm; Clustering algorithms; Databases; Equations; Image segmentation; Mathematical model; Noise; Writing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control, Decision and Information Technologies (CoDIT), 2014 International Conference on
  • Conference_Location
    Metz
  • Type

    conf

  • DOI
    10.1109/CoDIT.2014.6996979
  • Filename
    6996979
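
The multi-threshold idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the "darker than threshold means ink" rule, the 8-connected flood fill, and the use of component centroids as the accumulated points are all assumptions made for illustration. The sketch stops at collecting the points; the 95% confidence-interval cluster selection is not attempted because the abstract does not specify how it is applied.

```python
import numpy as np


def binarize(gray, threshold):
    """Global binarization (assumed rule): ink is darker than the threshold."""
    return gray < threshold


def connected_components(mask):
    """Label 8-connected foreground regions with an iterative flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    rows, cols = mask.shape
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or labels[r, c] != 0:
                continue
            count += 1
            labels[r, c] = count
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = count
                            stack.append((ny, nx))
    return labels, count


def centroid_points(gray, thresholds):
    """Accumulate one (x, y) centroid per component per threshold level.

    Assumption: centroids that stay close together across levels are
    treated as belonging to the same sub-word (PAW).
    """
    points = []
    for t in thresholds:
        labels, count = connected_components(binarize(gray, t))
        for k in range(1, count + 1):
            ys, xs = np.nonzero(labels == k)
            points.append((float(xs.mean()), float(ys.mean())))
    return points
```

On a grayscale word image, calling `centroid_points(image, thresholds)` with several threshold values yields a point cloud in which each stable sub-word contributes one point per level at which it is visible; grouping those points is where a cluster-selection step would follow.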