• DocumentCode
    2323292
  • Title
    A novel segmentation technique for splitting a typed Persian text to sub-words
  • Author
    Shafii, Mahnaz ; Sid-Ahmed, Maher A. ; Ahmadi, Majid
  • Author_Institution
    Univ. of Windsor, Windsor, ON, Canada
  • fYear
    2012
  • fDate
    2-4 May 2012
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    One common approach to recognizing Persian text is to segment the text into its component words and then segment the words into single characters. Due to specific characteristics of Persian words, this approach is non-trivial, and the literature reports relatively low success rates for Persian character segmentation. As an alternative, we segment a Persian text only into its component sub-words and then recognize each sub-word from a large library of sub-words. In this document, we describe a novel segmentation technique that splits a Persian text into its component sub-words, with a perfect success rate for the texts and fonts tested.
  • Keywords
    character recognition; image recognition; image segmentation; Persian character segmentation; segmentation technique; sub-words; text recognition; typed Persian text; Algorithm design and analysis; Character recognition; Image segmentation; Labeling; Optical character recognition software; Sorting; Text recognition; OCR; Persian Text Recognition; Sub-words Segmentation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications Control and Signal Processing (ISCCSP), 2012 5th International Symposium on
  • Conference_Location
    Rome
  • Print_ISBN
    978-1-4673-0274-6
  • Type
    conf
  • DOI
    10.1109/ISCCSP.2012.6217760
  • Filename
    6217760