DocumentCode
1992643
Title
Digital libraries and document image analysis
Author
Baird, Henry S.
Author_Institution
Palo Alto Res. Center, CA, USA
fYear
2003
fDate
3-6 Aug. 2003
Firstpage
2
Abstract
The rapid growth of digital libraries (DLs) worldwide poses many new challenges for document image analysis (DIA) research and development. DLs promise to offer more people access to larger document collections, and at far greater speed, than physical libraries can. But DLs also tend, for many reasons, to serve poorly, or even to omit entirely, many types of non-digital human-legible media, such as originally printed and handwritten documents. These media, in their original physical (undigitized) form, are readily - if not always quickly - legible, searchable, and browseable, whereas in the form of document images accessed through DLs they often lose many of their original advantages while of course lacking many advantages of symbolically encoded information. The author explores these issues and illustrates them with brief case studies arising from his experience as a DIA researcher in collaboration with several DL projects in the US. Difficult open DIA technical problems in DL applications are identified in the contrasting advantages of paper and digital displays, at every stage of capture, early processing, recognition, analysis, presentation, retrieval, and in personal and interactive applications. These support the conclusion that the international DIA R & D community is urgently needed (because uniquely qualified) to provide new technology to help rescue from neglect - even, in many cases, eventual oblivion - the world's vast culturally irreplaceable legacy paper document collections.
Keywords
digital libraries; document image processing; image coding; image retrieval; digital displays; document image analysis; document image capture; handwritten documents; nondigital human-legible media; symbolically encoded information; Digital images; Displays; Electron traps; Image analysis; Online Communities/Technical Collaboration; Paper technology; Research and development; Software libraries; Text analysis; XML;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN
0-7695-1960-1
Type
conf
DOI
10.1109/ICDAR.2003.1227619
Filename
1227619