DocumentCode :
658337
Title :
Towards an Expressive and Scalable Twitter's Users Profiles
Author :
Subercaze, Julien ; Gravier, Christophe ; Laforest, Frederique
Author_Institution :
Telecom St.-Etienne, St.-Etienne, France
Volume :
1
fYear :
2013
fDate :
17-20 Nov. 2013
Firstpage :
101
Lastpage :
108
Abstract :
Microblogging websites such as Twitter produce a tremendous amount of data each second. Consequently, real-time recommendation systems require very efficient algorithms to process this massive amount of data quickly. In this paper we propose a scalable and extensible way of building content-based user profiles. Scalability refers to the relative complexity of the algorithms involved in building the user profiles with respect to state-of-the-art solutions. Extensibility means avoiding recomputation of the model for newcomers. We present a tractable algorithm to build user profiles out of their tweets. Our model is a graph of term co-occurrences, driven by the fact that users sharing similar interests will share similar terms. We then present how this model can be encoded as a binary footprint, hence speeding up the comparison of users. We provide an empirical study to measure how the distance between users in the hash space differs from the distance between users computed with standard Information Retrieval techniques. This experiment is based on a Twitter dataset we crawled, which comprises 25K users and 1 million tweets. Our approach is driven by real-time analysis requirements and is thus oriented toward a trade-off between expressivity and efficiency. Experimental results show that our approach outperforms the vector space model by three orders of magnitude, with a precision of 58%.
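The pipeline outlined in the abstract (term co-occurrence graph, binary footprint, distance in the hash space) can be illustrated with a minimal Python sketch. The abstract does not specify how the graph is encoded, so everything here is an assumption: the SimHash-style bit voting, the whitespace tokenizer, the 64-bit width, and the function names `cooccurrence_edges`, `footprint` and `hamming` are hypothetical and are not taken from the paper.

```python
import hashlib
from collections import Counter
from itertools import combinations


def cooccurrence_edges(tweets):
    """Count how often each unordered pair of terms appears in the same tweet.

    The result is a weighted edge list of a term co-occurrence graph.
    Whitespace tokenization is a simplification chosen for this sketch.
    """
    edges = Counter()
    for tweet in tweets:
        terms = sorted(set(tweet.lower().split()))
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1
    return edges


def footprint(edges, bits=64):
    """Encode a weighted edge list as a `bits`-wide integer.

    Assumption: a SimHash-style scheme in which every edge votes on each
    bit position with its weight. `bits` must be a multiple of 8, at most 512.
    """
    votes = [0] * bits
    for (a, b), weight in edges.items():
        key = f"{a}\x00{b}".encode("utf-8")
        digest = hashlib.blake2b(key, digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for i in range(bits):
            votes[i] += weight if (h >> i) & 1 else -weight
    # A bit is set when the weighted votes for it are strictly positive.
    return sum(1 << i for i, v in enumerate(votes) if v > 0)


def hamming(x, y):
    """Number of bit positions in which two footprints differ."""
    return bin(x ^ y).count("1")


# Usage: two toy users, each represented by a handful of tweets.
alice = footprint(cooccurrence_edges(["python graph hashing", "graph hashing twitter"]))
bob = footprint(cooccurrence_edges(["football match tonight", "football league match"]))
distance = hamming(alice, bob)
```

Comparing two users then costs one XOR and one bit count, whatever the size of their vocabularies, which is the kind of saving a binary footprint is meant to provide over comparing term vectors.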
Keywords :
data handling; graph theory; information retrieval; recommender systems; social networking (online); Twitter dataset; binary footprint; content-based user profiles; expressive Twitter user profiles; hash space; microblogging Web sites; real-time analysis requirements; real-time recommendation systems; scalable Twitter user profiles; standard information retrieval techniques; terms cooccurency graph; tractable algorithm; Buildings; Collaboration; Recommender systems; Sparse matrices; Twitter; recommendation; twitter;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4799-2902-3
Type :
conf
DOI :
10.1109/WI-IAT.2013.15
Filename :
6690000