  • DocumentCode
    2013560
  • Title
    Fast and Accurate Detection of Document Skew and Orientation
  • Author
    Shijian Lu; Jie Wang; Chew Lim Tan
  • Author_Institution
    Nat. Univ. of Singapore, Singapore
  • Volume
    2
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    684
  • Lastpage
    688
  • Abstract
    This paper presents a document skew and orientation detection technique. The proposed technique estimates document skew and orientation based on the observation that text images normally contain many equidistant interline spacings and that character ascenders are statistically far more numerous than character descenders. Given a document image with arbitrary skew and orientation, white run histograms are first constructed by scanning the document in the horizontal and vertical directions. Document skew is then estimated by using the white runs that exactly span the interline spacing. Lastly, document orientation is determined according to the numbers of character ascenders and descenders, which are detected by using the white runs that cross the interline spacing and lie over character ascenders and descenders. Experiments show that the proposed technique is fast, accurate, and capable of detecting arbitrary document skew and orientation.
  • Keywords
    character recognition; document image processing; character ascenders; character descenders; document image; document orientation detection; document skew detection; histograms; text images; Character recognition; Computer science; Histograms; Image analysis; Image retrieval; Labeling; Nearest neighbor searches; Optical character recognition software; Optical distortion; Text analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Parana
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type
    conf
  • DOI
    10.1109/ICDAR.2007.4377002
  • Filename
    4377002
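
The abstract describes building white run histograms and comparing ascender and descender counts, but gives no implementation details. The following is only a minimal sketch of those two ideas, not the authors' method. It assumes a binarized, unskewed image stored as a list of rows (0 = white, 1 = black) and scans only vertically. The function names, the toy page, and the tie-handling rule are all hypothetical illustrations.

```python
from collections import Counter


def white_runs_in_column(image, col):
    """Lengths of white runs in one column that are bounded by black
    pixels both above and below (0 = white, 1 = black)."""
    runs = []
    run = 0
    seen_black = False
    for row in image:
        if row[col] == 0:
            run += 1
        else:
            # A black pixel closes the current white run, if any.
            if seen_black and run > 0:
                runs.append(run)
            seen_black = True
            run = 0
    # A trailing run touching the bottom border is discarded.
    return runs


def vertical_white_run_histogram(image):
    """Histogram of bounded vertical white-run lengths over all columns."""
    hist = Counter()
    for col in range(len(image[0])):
        hist.update(white_runs_in_column(image, col))
    return hist


def dominant_run_length(hist):
    """Most frequent run length; ties go to the shorter run (assumption)."""
    return max(hist.items(), key=lambda kv: (kv[1], -kv[0]))[0]


def decide_orientation(ascenders, descenders):
    """Ascenders normally outnumber descenders in upright text."""
    if ascenders == descenders:
        return "undetermined"
    return "upright" if ascenders > descenders else "upside-down"


# Toy page (hypothetical data): a 2-row top margin, then four solid
# 3-row "text lines", each followed by 4 white rows; 10 columns wide.
WIDTH = 10
page = [[0] * WIDTH for _ in range(2)]
for _ in range(4):
    page += [[1] * WIDTH for _ in range(3)]
    page += [[0] * WIDTH for _ in range(4)]

hist = vertical_white_run_histogram(page)
spacing = dominant_run_length(hist)
```

On this toy page every column contributes three bounded white runs of length 4, so the histogram peaks at the interline spacing. Real text would also produce shorter runs inside and between characters, and selecting the runs that "exactly span the interline spacing" would need criteria that the abstract does not specify.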