DocumentCode :
1791579
Title :
Geotagging one hundred million Twitter accounts with total variation minimization
Author :
Compton, Ryan ; Jurgens, David ; Allen, David
Author_Institution :
Inf. & Syst. Sci. Lab., HRL Labs., Malibu, CA, USA
fYear :
2014
fDate :
27-30 Oct. 2014
Firstpage :
393
Lastpage :
401
Abstract :
Geographically annotated social media is extremely valuable for modern information retrieval. However, when researchers can only access publicly-visible data, one quickly finds that social media users rarely publish location information. In this work, we provide a method which can geolocate the overwhelming majority of active Twitter users, independent of their location sharing preferences, using only publicly-visible Twitter data. Our method infers an unknown user's location by examining their friends' locations. We frame the geotagging problem as an optimization over a social network with a total variation-based objective and provide a scalable and distributed algorithm for its solution. Furthermore, we show how a robust estimate of the geographic dispersion of each user's ego network can be used as a per-user accuracy measure which is effective at removing outlying errors. Leave-many-out evaluation shows that our method is able to infer location for 101,846,236 Twitter users at a median error of 6.38 km, allowing us to geotag over 80% of public tweets.
Keywords :
data mining; distributed algorithms; minimisation; social networking (online); Twitter account geotagging; Twitter user ego network geographic dispersion; distributed algorithm; geographically annotated social media; information retrieval; publicly-visible Twitter data; total variation minimization; Accuracy; Global Positioning System; Media; Minimization; Optimization; Twitter; Social and Information Networks
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (Big Data), 2014 IEEE International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/BigData.2014.7004256
Filename :
7004256