DocumentCode :
2530009
Title :
Document images analysis solutions for digital libraries
Author :
Bourgeois, F. Le ; Trinh, E. ; Allier, B. ; Eglin, V. ; Emptoz, H.
Author_Institution :
LIRIS, CNRS, Villeurbanne, France
fYear :
2004
fDate :
2004
Firstpage :
2
Lastpage :
24
Abstract :
Today the development of digital libraries is reaching technological limits because of the difficulty of automatically processing a growing mass of digitized document images from different origins. The main problem is the high cost of the digitization and retro-conversion processes, which include image capture and indexing, metadata extraction, image storage, conversion to a reusable electronic form, publication on the Internet, and reduction of image weight for faster access. To reduce the cost of digitization and retro-conversion, we need to break technological bottlenecks, such as the development of "intelligent" digitizers that reduce manual intervention and produce the best-quality images. Retro-conversion needs efficient software that analyzes image contents and automatically extracts all the information necessary for image indexing. Other technological bottlenecks must also be considered, such as the need for an open file format that can describe digitized documents as heterogeneous media. This article is not a state-of-the-art survey of the domain; it only describes some cases that we have studied in our laboratory over the past years.
Keywords :
digital libraries; document image processing; indexing; information retrieval; Internet; digital library; digitized document; digitized image; document images analysis solution; heterogeneous media; image capture; image indexation; image quality; image storage; image weight; information extraction; metadata extraction; open file format; retro-conversion process; reusable electronic form conversion; Costs; Data mining; Image analysis; Image converters; Image storage; Information analysis; Internet; Manuals; Software libraries; Text analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Image Analysis for Libraries, 2004. Proceedings. First International Workshop on
Print_ISBN :
0-7695-2088-X
Type :
conf
DOI :
10.1109/DIAL.2004.1263233
Filename :
1263233