• DocumentCode
    397301
  • Title

    User-assisted archive document image analysis for digital library construction

  • Author

    He, J. ; Downton, A.C.

  • Author_Institution
    Dept. of Electron. Syst. Eng., Essex Univ., Colchester, UK
  • fYear
    2003
  • fDate
    3-6 Aug. 2003
  • Firstpage
    498
  • Abstract
    A configurable archive document image analysis system for digital library construction has been designed using rapid prototyping and top-down iterative development methods. This approach has been found to be essential in order to capture the curators' expertise about existing card archive structures, content and databases. The design currently achieves about 93% correct segmentation of the required archive card fields overall, with 81.3% of all archive cards in a testset of 2000 images having all fields correctly segmented and labeled. Analysis of errors in the testset indicates that heavily-annotated cards and non-standard card formats comprise 5-10% of the overall archive, and a significant proportion of these are unlikely to be resolvable without curatorial intervention.
  • Keywords
    digital libraries; document image processing; graphical user interfaces; image segmentation; archive database; archive document image analysis; card archive content; card archive structure; configurable document image analysis; digital library construction; rapid prototyping; top-down iterative development method; user-assisted document image analysis; Image analysis; Image converters; Image databases; Image segmentation; Laboratories; Multimedia systems; Optical character recognition software; Software libraries; Testing; Text analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
  • Print_ISBN
    0-7695-1960-1
  • Type

    conf

  • DOI
    10.1109/ICDAR.2003.1227715
  • Filename
    1227715