Title :
Detecting future social unrest in unprocessed Twitter data: “Emerging phenomena and big data”
Author :
Compton, Ryan ; Lee, Chi-Kwan ; Lu, Tsai-Ching ; de Silva, Lakdeepal ; Macy, Michael
Author_Institution :
HRL Labs., Malibu, CA, USA
Abstract :
We have implemented a social media data mining system capable of forecasting events related to Latin American social unrest. Our method directly extracts a small number of tweets from publicly available data on twitter.com, condenses similar tweets into coherent forecasts, and assembles a detailed and easily interpretable audit trail that allows end users to quickly collect information about an upcoming event. Our system functions by continually applying multiple textual and geographic filters to a large volume of data streaming from twitter.com via the public API as well as a commercial data feed. Specifically, we search the entirety of twitter.com for a few carefully chosen keywords, search within those tweets for mentions of future dates, filter again using various logistic regression classifiers, and finally assign a location to an event by geocoding retweeters. Geocoding is done using our previously developed in-house geocoding service which, at the time of this writing, can infer the home location for over 62M twitter.com users [1]. Additionally, we identify demographics likely interested in an upcoming event by searching retweeters' recent posts for demographic-specific keywords.
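The filtering cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword list, the Spanish date pattern, the tweet dictionary layout, and the function names (`has_keyword`, `future_date`, `majority_location`, `forecast`) are all assumptions made for illustration, the logistic regression classifier stage is omitted, and a plain lookup table with a majority vote stands in for the in-house geocoding service.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical keyword list; the abstract does not state the actual keywords.
KEYWORDS = {"protesta", "marcha", "huelga"}

# Spanish month names, used to recognise mentions such as "12 de marzo".
MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4,
    "mayo": 5, "junio": 6, "julio": 7, "agosto": 8,
    "septiembre": 9, "octubre": 10, "noviembre": 11, "diciembre": 12,
}
DATE_RE = re.compile(
    r"\b(\d{1,2}) de (" + "|".join(MONTHS) + r")\b", re.IGNORECASE
)


def has_keyword(text):
    """Step 1: keep only tweets containing at least one chosen keyword."""
    words = set(re.findall(r"\w+", text.lower()))
    return bool(words & KEYWORDS)


def future_date(text, posted):
    """Step 2: return the first mentioned date if it lies after `posted`."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    day = int(match.group(1))
    month = MONTHS[match.group(2).lower()]
    try:
        candidate = date(posted.year, month, day)
    except ValueError:  # impossible date such as "31 de febrero"
        return None
    return candidate if candidate > posted else None


def majority_location(retweeters, home_locations):
    """Step 3 (stand-in): most common known home location among retweeters."""
    known = [home_locations[u] for u in retweeters if u in home_locations]
    if not known:
        return None
    return Counter(known).most_common(1)[0][0]


def forecast(tweet, home_locations):
    """Run the cascade on one tweet; return a forecast dict or None."""
    if not has_keyword(tweet["text"]):
        return None
    when = future_date(tweet["text"], tweet["posted"])
    if when is None:
        return None
    where = majority_location(tweet["retweeters"], home_locations)
    return {"date": when, "location": where}
```

In this sketch each stage only narrows the stream, so cheap textual checks run before the location lookup, mirroring the order of filters listed in the abstract.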
Keywords :
application program interfaces; data mining; demography; information filters; pattern classification; regression analysis; social networking (online); social sciences computing; Latin American social unrest; audit trail; commercial data feed; data streaming; demographic-specific keywords; events forecasting; future social unrest detection; geocoding retweeters; geographic filters; in-house geocoding service; information collection; logistic regression classifiers; multiple textual filters; public API; publicly-available data; social media data mining system; twitter.com; unprocessed Twitter data; Cities and towns; Feeds; Government; Logistics; Media; Writing;
Conference_Title :
Intelligence and Security Informatics (ISI), 2013 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
978-1-4673-6214-6
DOI :
10.1109/ISI.2013.6578786