• DocumentCode
    3140514
  • Title

    Interactive approach to the extraction of logical structures from unformatted document images using a sub-structure model

  • Author

    Yamaoka, Masaki ; Iwaki, Osamu ; Babaguchi, Noboru ; Kitahashi, Tadahiro

  • Author_Institution
    Inst. of Sci. & Ind. Res., Osaka Univ., Japan
  • fYear
    1999
  • fDate
    20-22 Sep 1999
  • Firstpage
    185
  • Lastpage
    188
  • Abstract
    Describes a new document analysis method for unformatted documents such as advertisements or catalogs. Conventional model-based approaches to the extraction of logical structures are hard to apply to advertisements or catalogs, because a model of a whole page cannot be defined. However, in these kinds of documents the regions that represent each product have similar configurations, so a local model of the local layout and logical structure can be defined. This model, which we call a sub-structure model, can be used as a template to extract the logical structures from other regions that represent the same kinds of products. In the proposed system, a sub-structure model is captured through an interactive process with a user. The system was tested on advertisements in Japanese computer magazines, and the experiments show promising results.
  • Keywords
    advertising data processing; cataloguing; document image processing; image segmentation; Japanese computer magazines; advertisements; catalogs; document analysis method; document region configurations; interactive approach; local layout; local model; logical structure extraction; product representation; sub-structure model; template; unformatted document images; Application software; Catalogs; Character recognition; Data mining; Image analysis; Information retrieval; Information systems; Research and development; System testing; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    0-7695-0318-7
  • Type

    conf

  • DOI
    10.1109/ICDAR.1999.791755
  • Filename
    791755