Title :
"Hey #311, Come Clean My Street!": A Spatio-temporal Sentiment Analysis of Twitter Data and 311 Civil Complaints
Author :
Eshleman, Ryan ; Yang, Hui
Author_Institution :
Dept. of Comput. Sci., San Francisco State Univ., San Francisco, CA, USA
Abstract :
Twitter data has been applied to a wide range of applications (e.g., political election prediction and disease tracking); however, no studies have explored the interactions and potential relationships between Twitter data and social events available from government entities. In this paper, we introduce a novel approach to investigate the spatio-temporal relationships between the sentiment aspects of tweets and 311 civil complaints recorded in the 311 Case Database, which is freely available from the City of San Francisco. We also present results from two supporting tasks: (1) we apply sentiment analysis techniques to model the emotional characteristics of five metropolitan areas around the globe, allowing one to gain insight into the relative happiness across cities and across neighborhoods within a city, and (2) we quantify the performance of several open-source machine learning algorithms for sentiment analysis by applying them to a large volume of Twitter data, thereby providing empirical guidelines for practitioners. Major contributions and findings include: (1) we have developed a system for the relative ranking of happiness of a geographical area, and our results show that Sydney, Australia is the happiest of the five cities under study; (2) we have found a counterintuitive positive correlation between 311-report frequency and local sentiment; and (3) when performing sentiment analysis of tweets, the inclusion of emoticons in the training dataset can lead to model overfitting, whereas NLP-based features seem to have great potential to improve classification accuracy.
Keywords :
emotion recognition; learning (artificial intelligence); natural language processing; public domain software; social networking (online); 311 case database; 311-report frequency; Australia; NLP-based features; San Francisco City; Sydney; Twitter data; civil complaints; disease tracking; emotional characteristics; government entities; open-source machine learning algorithms; political election prediction; sentiment analysis techniques; sentiment aspects; social events; spatio-temporal sentiment analysis; tweet sentiment analysis; twitter data; Accuracy; Cities and towns; Indexes; Machine learning algorithms; Sentiment analysis; Twitter; 311 civil complaints; happiness index; online social networks; sentiment analysis; spatio-temporal analysis; twitter data;
Conference_Title :
Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/BDCloud.2014.106