Title :
Toponym Recognition in Historical Maps by Gazetteer Alignment
Author_Institution :
Grinnell Coll., Grinnell, IA, USA
Abstract :
Historical map documents are increasingly digitized for widespread access, but most are only coarsely indexed with metadata while the contents are largely unsearchable. We propose to increase searchability by automatically recognizing the place names in these digitized artifacts. Using a word recognition system that produces a noisy ranked list of initial hypotheses from a lexicon of viable toponyms, we form a joint probabilistic model for inferring the most likely latent alignment between image toponyms and a gazetteer of known place locations. After a robust generalized RANSAC algorithm identifies the global alignment, we rerank the toponym hypotheses by their posterior probability. Experiments demonstrate a significant boost in word recognition accuracy on a manually annotated set of 19th-century U.S. state and regional maps.
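The pipeline the abstract describes (noisy ranked hypotheses, a global alignment found by generalized RANSAC, then reranking by posterior) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 2D similarity transform between image and gazetteer coordinates, an isotropic Gaussian position error with hypothetical scale `sigma`, and an MLESAC-style soft score; the abstract does not state the paper's actual alignment model. All function names, place coordinates, and recognizer scores below are invented for illustration.

```python
import math
import random

def fit_similarity(p1, q1, p2, q2):
    """Similarity transform z -> a*z + b from two point pairs (points as complex numbers)."""
    a = (q2 - q1) / (p2 - p1)
    return a, q1 - a * p1

def geo_weight(pos, name, gazetteer, a, b, sigma):
    """Assumed Gaussian agreement between a projected image position and a gazetteer entry."""
    d = abs(a * pos + b - gazetteer[name])
    return math.exp(-0.5 * (d / sigma) ** 2)

def ransac_align(words, gazetteer, sigma=1.0, iters=200, seed=0):
    """Generalized RANSAC: sample two words AND one candidate label for each, fit, score."""
    rng = random.Random(seed)
    best, best_score = None, -1.0
    for _ in range(iters):
        (p1, h1), (p2, h2) = rng.sample(words, 2)
        q1 = gazetteer[rng.choice(h1)[0]]
        q2 = gazetteer[rng.choice(h2)[0]]
        if q1 == q2:
            continue  # degenerate sample: both words mapped to the same place
        a, b = fit_similarity(p1, q1, p2, q2)
        # Soft (MLESAC-style) score: each word contributes its best-supported hypothesis.
        score = sum(
            max(s * geo_weight(pos, n, gazetteer, a, b, sigma) for n, s in hyps)
            for pos, hyps in words
        )
        if score > best_score:
            best, best_score = (a, b), score
    return best

def rerank(words, gazetteer, a, b, sigma=1.0):
    """Rerank each word's hypotheses by posterior, proportional to score times geo weight."""
    ranked = []
    for pos, hyps in words:
        scored = [(s * geo_weight(pos, n, gazetteer, a, b, sigma), n) for n, s in hyps]
        total = sum(w for w, _ in scored) or 1.0
        ranked.append(sorted(((w / total, n) for w, n in scored), reverse=True))
    return ranked

# Toy gazetteer (hypothetical coordinates) and a known transform used to place the words.
gazetteer = {"Grinnell": 0 + 0j, "Newton": 10 + 0j, "Ames": 5 + 8j,
             "Pella": 12 - 6j, "Anita": -20 + 3j}
true_a, true_b = 2 + 0j, 1 + 1j

def image_pos(name):
    return (gazetteer[name] - true_b) / true_a

# Each word: (image position, ranked recognizer hypotheses). The second word's top guess is wrong.
words = [
    (image_pos("Grinnell"), [("Grinnell", 0.6), ("Pella", 0.4)]),
    (image_pos("Newton"), [("Anita", 0.5), ("Newton", 0.4)]),
    (image_pos("Ames"), [("Ames", 0.7), ("Anita", 0.3)]),
    (image_pos("Pella"), [("Pella", 0.8), ("Grinnell", 0.2)]),
]

model = ransac_align(words, gazetteer)
ranked = rerank(words, gazetteer, *model)
```

In this toy setup the recognizer alone prefers "Anita" for the second word, but once the global alignment is recovered the geographically consistent label "Newton" rises to the top of `ranked[1]`.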
Keywords :
cartography; database indexing; document image processing; meta data; probability; visual databases; word processing; US state maps; automatic place name recognition; digitized artifacts; gazetteer alignment; historical map documents; image toponyms; joint probabilistic model; latent alignment; metadata; posterior probability; regional maps; robust generalized RANSAC algorithm; toponym hypotheses; toponym lexicon; toponym recognition; word recognition; word recognition system; Graphics; Image recognition; Optical character recognition software; Probability; Robustness; Standards; Text recognition; MLESAC; correspondence; gazetteer; generalized RANSAC; georectification; historical maps
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.209