Title :
XML Data Representation in Document Image Analysis
Author :
Belaïd, Abdel ; Falk, Ingrid ; Rangoni, Yves
Author_Institution :
Univ. Nancy 2, Vandoeuvre-les-Nancy
Abstract :
This paper presents the XML-based formats ALTO, TEI, and METS, which are used in digital libraries, and their relevance to data representation in a document image analysis and recognition (DIAR) process. The first part briefly presents these formats, focusing on their suitability for the structural representation and modeling of DIAR data. The second part shows how these formats can be used in a reverse engineering process and how they can be implemented as a data representation framework.
Keywords :
XML; document image processing; image recognition; image representation; ALTO; METS; TEI; XML data representation; XML-based formats; digital libraries; document image analysis; document image recognition; structural modeling; structural representation; Encoding; Guidelines; Image analysis; Optical character recognition software; Reverse engineering; Software libraries; Text analysis; Text recognition
Conference_Title :
Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
Conference_Location :
Parana
Print_ISBN :
978-0-7695-2822-9
DOI :
10.1109/ICDAR.2007.4378679