Regional Information Center for Science and Technology - Tweets mining using WIKIPEDIA and impurity cluster measurement

DocumentCode :

2643004

Title :

Tweets mining using WIKIPEDIA and impurity cluster measurement

Author :

Chen, Qing ; Shipper, Timothy ; Khan, Latifur

Author_Institution :

Dept. of Comput. Sci., Univ. of Texas at Dallas, Dallas, TX, USA

fYear :

2010

fDate :

23-26 May 2010

Firstpage :

141

Lastpage :

143

Abstract :

Twitter is one of the fastest growing online social networking services. Tweets can be categorized into trends, and are related to tags and follower/following social relationships. The categorization is neither accurate nor effective due to the short length of tweet messages and the noisy data corpus. In this paper, we attempt to overcome these challenges with an extended feature vector along with a semi-supervised clustering technique. To achieve this goal, the training set is expanded with Wikipedia topic search results, and the feature set is extended. When building the clustering model and doing the classification, impurity measurement is introduced into our classifier platform. Our experimental results show that the proposed techniques outperform other classifiers with reasonable precision and recall.
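
The abstract does not say which impurity measure the authors use, so the following is only a minimal sketch of the general idea, assuming an entropy-based impurity computed over the labeled points in a cluster. The function name `cluster_impurity` and the sample labels are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log2


def cluster_impurity(labels):
    """Entropy of the known class labels inside one cluster.

    Assumption: impurity is measured as Shannon entropy; the paper's
    actual measure may differ. None marks an unlabeled point, which is
    ignored, as is usual in a semi-supervised setting.
    """
    known = [label for label in labels if label is not None]
    if not known:
        # No labeled points: nothing to be impure about.
        return 0.0
    counts = Counter(known)
    total = len(known)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical clusters of tweets: one label per tweet, None if unlabeled.
pure = cluster_impurity(["sports", "sports", None])
mixed = cluster_impurity(["sports", "politics", None, None])
```

Under this assumption, a cluster whose labeled tweets all share one trend has zero impurity, and a cluster that mixes trends scores higher, so a clustering objective can penalize mixed clusters.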

Keywords :

Clustering algorithms; Computer science; Euclidean distance; Impurities; Nearest neighbor searches; Neural networks; Partitioning algorithms; Social network services; Twitter; Wikipedia; extended features; tweet mining; wikipedia;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligence and Security Informatics (ISI), 2010 IEEE International Conference on

Conference_Location :

Vancouver, BC, Canada

Print_ISBN :

978-1-4244-6444-9

Type :

conf

DOI :

10.1109/ISI.2010.5484758

Filename :

5484758

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2643004