• DocumentCode
    1632942
  • Title
    Page Rule-Line Removal Using Linear Subspaces in Monochromatic Handwritten Arabic Documents
  • Author
    Abd-Almageed, Wael ; Kumar, Jayant ; Doermann, David
  • Author_Institution
    Language and Media Processing Laboratory, University of Maryland, College Park, MD, USA
  • fYear
    2009
  • Firstpage
    768
  • Lastpage
    772
  • Abstract
    In this paper, we present a novel method for removing page rule lines from monochromatic handwritten Arabic documents using subspace methods, with minimal effect on the quality of the foreground text. We use moment and histogram properties to extract features that characterize the underlying rule lines. A linear subspace is built incrementally to obtain a line model that is used to identify rule-line pixels. We also introduce a novel scheme for evaluating noise removal algorithms in general, and we use it to assess the quality of our rule-line removal algorithm. Experimental results on a data set of 50 Arabic documents, handwritten by different writers, demonstrate the effectiveness of the proposed method.
  • Keywords
    document image processing; feature extraction; handwritten character recognition; image denoising; method of moments; natural languages; statistical analysis; text analysis; foreground text quality; histogram property; linear subspace method; moment property; monochromatic handwritten Arabic document; noise removal algorithm; page rule-line removal; Educational institutions; Handwriting recognition; Hidden Markov models; Laboratories; Optical character recognition software; Optical noise; Testing; Text recognition; Rule line; data description; subspace
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1520-5363
  • Print_ISBN
    978-1-4244-4500-4
  • Electronic_ISBN
    1520-5363
  • Type
    conf
  • DOI
    10.1109/ICDAR.2009.276
  • Filename
    5277504