• DocumentCode
    3748877
  • Title

    Extraction of Virtual Baselines from Distorted Document Images Using Curvilinear Projection

  • Author

    Gaofeng Meng;Zuming Huang;Yonghong Song;Shiming Xiang;Chunhong Pan

  • Author_Institution
    Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
  • fYear
    2015
  • Firstpage
    3925
  • Lastpage
    3933
  • Abstract
    The baselines of a document page are a set of virtual, horizontal, parallel lines to which the printed contents of the document (e.g., text lines, tables, or inserted photos) are aligned. Accurate baseline extraction is of great importance in the geometric correction of curved document images. In this paper, we propose an efficient method for accurately extracting these virtual visual cues from a curved document image. Our method rests on two basic observations: the baselines of a document do not intersect one another, and within a narrow strip the baselines can be well approximated by linear segments. Based on these observations, we propose a curvilinear-projection-based method and model the estimation of curved baselines as a constrained sequential optimization problem. A dynamic programming algorithm is then developed to solve the problem efficiently. The proposed method can extract the complete baseline through each pixel of a document image with high accuracy. It is also script-insensitive and highly robust to image noise, non-textual objects, varying image resolution, and image quality degradation such as blurring and non-uniform illumination. Extensive experiments on a number of captured document images demonstrate the effectiveness of the proposed method.
  • Keywords
    "Strips","Image segmentation","Radon","Transforms","Optimization","Layout","Robustness"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision (ICCV), 2015 IEEE International Conference on
  • Electronic_ISBN
    2380-7504
  • Type

    conf

  • DOI
    10.1109/ICCV.2015.447
  • Filename
    7410804
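
The abstract describes approximating baselines by linear segments inside narrow strips and choosing them with dynamic programming. The following is a minimal toy sketch of that general idea, not the authors' algorithm: it assumes a binary strip image, scores each candidate slope by the variance of a sheared projection profile, and links one slope per strip with a Viterbi-style pass. The function names `projection_score` and `best_slope_path`, the variance criterion, and the absolute-difference smoothness penalty are all illustrative assumptions that do not come from the paper.

```python
from typing import List, Sequence


def projection_score(strip: Sequence[Sequence[int]], slope: float) -> float:
    """Variance of the projection profile taken along lines of `slope`.

    `strip[y][x]` is 1 for ink and 0 for background (assumed input format).
    A peaky profile (high variance) suggests the ink is aligned along
    straight lines with this slope.
    """
    height = len(strip)
    width = len(strip[0])
    profile = [0] * height
    for y in range(height):
        for x in range(width):
            if strip[y][x]:
                # Row where the line through (x, y) with this slope meets x = 0.
                y0 = round(y - slope * x)
                if 0 <= y0 < height:
                    profile[y0] += 1
    mean = sum(profile) / height
    return sum((p - mean) ** 2 for p in profile) / height


def best_slope_path(scores: List[List[float]], smooth: float) -> List[int]:
    """Pick one slope index per strip by dynamic programming.

    Maximises the summed scores minus `smooth * |index change|` between
    adjacent strips (a hypothetical smoothness term for illustration).
    """
    n_strips, n_slopes = len(scores), len(scores[0])
    total = list(scores[0])          # best objective ending at each slope
    back: List[List[int]] = []       # back-pointers, one list per transition
    for s in range(1, n_strips):
        new_total, pointers = [], []
        for k in range(n_slopes):
            prev = max(range(n_slopes),
                       key=lambda j: total[j] - smooth * abs(j - k))
            new_total.append(total[prev] - smooth * abs(prev - k) + scores[s][k])
            pointers.append(prev)
        total = new_total
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    k = max(range(n_slopes), key=lambda j: total[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    return path[::-1]
```

In use, one would compute `projection_score` for every strip and every candidate slope, then pass the resulting table to `best_slope_path`. A larger `smooth` value favours slopes that change little from strip to strip, which is one simple way to discourage neighbouring segments from diverging sharply.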