Title :
HistoSketch: A Semi-Automatic Annotation Tool for Archival Documents
Author :
Mas, Joan ; Rodríguez, José A. ; Karatzas, Dimosthenis ; Sánchez, Gemma ; Lladós, Josep
Author_Institution :
Comput. Sci. Dept., Univ. Autonoma de Barcelona, Barcelona
Abstract :
This article describes a sketch-based framework for semi-automatic annotation of historical document collections. Fully automatic methods, while helpful for extracting metadata from large collections, have two main drawbacks in real-world applications: (i) they are error-prone, and (ii) they capture only a subset of the knowledge in the document base. In both cases, manual intervention is always required. We have therefore developed a practical framework that allows experts to extract knowledge from document collections in a sketch-based scenario. The main capabilities of the proposed framework are: (a) efficient browsing of the collection, (b) gestures for metadata input, (c) support for handwritten notes, and (d) gestures for launching automatic extraction processes such as OCR or word spotting.
Keywords :
digital libraries; document image processing; handwriting recognition; information retrieval systems; knowledge acquisition; meta data; text analysis; archival document; digital library; handwritten recognition; knowledge extraction; metadata extraction; semiautomatic annotation tool; sketch-based framework; Computer science; Computer vision; Data mining; Focusing; Government; Optical character recognition software; Software libraries; Testing; Text analysis; User interfaces; Adjacency Grammars; Annotation tool; Graphics Recognition; Historical Documents; SVN; Word Spotting
Conference_Title :
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location :
Nara
Print_ISBN :
978-0-7695-3337-7
DOI :
10.1109/DAS.2008.70