DocumentCode :
1840681
Title :
Mining a Multilingual Geographical Gazetteer from the Web
Author :
Popescu, Adrian ; Grefenstette, Gregory ; Bouamor, Houda
Volume :
1
fYear :
2009
fDate :
15-18 Sept. 2009
Firstpage :
58
Lastpage :
65
Abstract :
Geographical gazetteers are necessary in a wide variety of applications. In the past, the construction of such gazetteers has been a tedious, manual process, and only recently have the first attempts been made to automate gazetteer creation. Here we describe our approach for mining accurate but large-scale multilingual geographic information by successively filtering information found in heterogeneous data sources (Flickr, Wikipedia, Panoramio, Web pages indexed by search engines). By statistically cross-checking information found on each site, we are able to identify new geographic objects and to indicate, for each one, its name, its GPS coordinates, its encompassing regions (city, region, country), the language of the name, its popularity, and the type of the object (church, bridge, etc.). We evaluate our approach by comparing, wherever possible, our multilingual gazetteer to other known attempts at automatically building a geographic database and to Geonames, a manually built gazetteer.
Keywords :
Bridges; Cities and towns; Databases; Global Positioning System; Information filtering; Information filters; Large-scale systems; Search engines; Web pages; Wikipedia; Flickr; Geographical gazetteer; Geonames; categorization; place names
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Milan, Italy
Print_ISBN :
978-0-7695-3801-3
Electronic_ISBN :
978-1-4244-5331-3
Type :
conf
DOI :
10.1109/WI-IAT.2009.16
Filename :
5284918