Author_Institution :
Dept. of Inf. & Telecommun., Univ. of Athens, Athens, Greece
Abstract :
Location is prevalent in most applications nowadays and is considered a first-class citizen in social networks. Locational information is of great significance, since it can be used to map information from the online world back to the physical world, to contextualize information, or to provide localized recommendations through Location-Based Services (LBS), and it can be extended to trajectories and itineraries by adding a time dimension. Even so, location extraction in social networks usually relies on GPS-enabled devices, which provide accurate geodetic coordinates. Most users, however, do not use a GPS-enabled device and instead provide general textual information about their surroundings, such as the city, county, or state (or equivalent) they live in; this is still valuable information for a number of applications. In this paper, we tackle the problem of extracting location information from additional user-provided content, a task usually referred to as geocoding. Instead of using sophisticated and complex algorithms, which are common in online map services but require heavy development and tuning, we rely on software and data that are available online and publicly accessible. We discuss the particularities of geocoding in online social networks and present a simple, lightweight, yet efficient approach for location extraction in such a setting. Finally, we evaluate our approach experimentally on a large corpus of Twitter users.
Keywords :
Global Positioning System; geodesy; information retrieval; mobile computing; social networking (online); text analysis; GPS-enabled devices; Twitter user corpus; commodity software; geocoding; geodetic coordinates; information mapping; localized recommendations; location information extraction; location-based services; locational information; online data; online map services; online social networks; textual information; user-provided content; Cities and towns; Cleaning; Databases; Software; Twitter; location extraction; social networks;