• DocumentCode
    2503887
  • Title

    Unsupervised Block Covering Analysis for Text-Line Segmentation of Arabic Ancient Handwritten Document Images

  • Author

    Boussellaa, Wafa ; Zahour, Abderrazak ; Elabed, Haikal ; Benabdelhafid, Abdelatif ; Alimi, Adel

  • Author_Institution
    Res. Group on Intell. Machines, Univ. of Sfax, Sfax, Tunisia
  • fYear
    2010
  • fDate
    23-26 Aug. 2010
  • Firstpage
    1929
  • Lastpage
    1932
  • Abstract
    This paper presents a new method for automatic text-line extraction from Arabic historical handwritten documents that exhibit overlapping and multi-touching characters. The approach is based on block covering analysis using an unsupervised technique. The algorithm first performs a statistical block analysis that computes the optimal number of vertical strips into which the document is decomposed. It then performs fuzzy baseline detection using the fuzzy C-means algorithm. Finally, each block is assigned to its corresponding line. Experimental results show that the proposed method achieves a high accuracy of about 95% in detecting text lines in Arabic historical handwritten document images written in different scripts.
  • Keywords
    document image processing; handwriting recognition; image segmentation; natural languages; statistical analysis; text analysis; Arabic ancient handwritten document images; automatic text-line extraction; fuzzy C-means algorithm; fuzzy base line detection; multitouching characters problems; statistical block analysis; text-line segmentation; unsupervised block covering analysis; Accuracy; Algorithm design and analysis; Clustering algorithms; Image segmentation; Partitioning algorithms; Pixel; Strips; Arabic historical document; Block covering Analysis; Fuzzy C-means; Fuzzy base line detection; Text-line segmentation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2010 20th International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1051-4651
  • Print_ISBN
    978-1-4244-7542-1
  • Type

    conf

  • DOI
    10.1109/ICPR.2010.475
  • Filename
    5597247
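
The abstract names fuzzy C-means as the clustering step behind baseline detection but gives no details. The following is a minimal sketch of standard one-dimensional fuzzy C-means, not the authors' implementation. It assumes that each block is summarized by a single vertical coordinate and that the number of lines is known in advance; the function names `fuzzy_c_means_1d` and `assign_lines`, the fuzzifier `m=2.0`, and the evenly spaced initialization are all illustrative choices that do not come from the record.

```python
def fuzzy_c_means_1d(values, n_clusters, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy C-means on scalar values; returns sorted cluster centers.

    Assumption: `values` are vertical positions of text blocks, and each
    cluster center approximates one baseline. `m` is the fuzzifier (m > 1).
    """
    lo, hi = min(values), max(values)
    # Illustrative initialization: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if any(d == 0.0 for d in dists):
                # A point sitting on a center belongs fully to that center
                # (split evenly if several centers coincide with it).
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((d_k / d_j) ** exponent for d_j in dists)
                       for d_k in dists]
            memberships.append(row)
        # Center update: mean of the values weighted by u_ik^m.
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return sorted(centers)


def assign_lines(values, centers):
    """Assign each value to the index of its nearest center (hard assignment)."""
    return [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values]


# Hypothetical block positions (pixels) forming three well-separated lines.
block_y = [10, 12, 11, 50, 52, 51, 90, 91, 89]
baselines = fuzzy_c_means_1d(block_y, n_clusters=3)
labels = assign_lines(block_y, baselines)
```

Assigning each block to its nearest center is equivalent to picking its highest fuzzy membership, since membership decreases with distance. How the paper chooses the number of clusters and handles strips is not stated in this record, so those parts are left out of the sketch.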