• DocumentCode
    3007548
  • Title

    A Content-Based Retrieval Algorithm for Document Image Database

  • Author

    Hou, Dewen ; Wang, Xichang ; Liu, Jiang

  • Author_Institution
    Key Lab. for Distrib. Comput. Software, Shandong Normal Univ., Jinan, China
  • fYear
    2010
  • fDate
    29-31 Oct. 2010
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    This paper studies a content-based image retrieval algorithm for document image databases. Given a query image, the system returns the images in the database that are similar to it overall. For document images, we propose an algorithm based on a hierarchical matching tree. An image is first segmented into several regions by paragraph marking based on paragraph height estimation, and each region is then segmented into line blocks; document images are retrieved by matching regions and line blocks with the hierarchical matching tree. We also describe the matching model and the texture character strings used for indexing. The algorithm is tested experimentally, and the results indicate that it is accurate and effective. Image scaling strongly reduces the retrieval response time. This retrieval efficiency is highly valuable for document image databases.
  • Keywords
    content-based retrieval; document image processing; image matching; image retrieval; image segmentation; visual databases; content-based image retrieval algorithm; document image database; hierarchical matching tree; image segmentation; matching model; paragraph height estimation; paragraph marking; texture character strings; Algorithm design and analysis; Feature extraction; Image retrieval; Image segmentation; Semantics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Technology (ICMT), 2010 International Conference on
  • Conference_Location
    Ningbo
  • Print_ISBN
    978-1-4244-7871-2
  • Type

    conf

  • DOI
    10.1109/ICMULT.2010.5631277
  • Filename
    5631277