DocumentCode :
2187166
Title :
Multi-classification of business types on twitter based on topic model
Author :
Thongsuk, Chanattha ; Haruechaiyasak, Choochart ; Saelee, Somkid
Author_Institution :
King Mongkut's Univ. of Technol. North Bangkok (KMUTNB), Bangkok, Thailand
fYear :
2011
fDate :
17-19 May 2011
Firstpage :
508
Lastpage :
511
Abstract :
Many businesses have adopted Twitter as a marketing channel to promote their products and services. One potentially useful application is recommending businesses for users to follow based on their interests. One possible solution is to apply a classification algorithm that assigns a user's Twitter posts to predefined business categories. Because Twitter posts are short, classifying them is difficult. In this paper, we propose a feature processing framework for constructing text categorization models. A topic model is constructed from a set of terms using the Latent Dirichlet Allocation (LDA) algorithm. We apply the topic model in two feature processing approaches: (1) feature transformation, i.e., using a set of topics as features, and (2) feature expansion, i.e., appending a set of topics to a set of terms. Experimental results show that the feature expansion technique achieves the highest accuracy, 95.7%, an improvement of 18.7% over the Bag of Words (BOW) model.
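The two feature processing approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy posts, the tiny collapsed Gibbs sampler, the hyperparameters (`alpha`, `beta`, `iters`), and all function names are assumptions made for illustration, and the classifier trained on these features is left out because the abstract does not name one.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not the paper's code).

    Returns the sorted vocabulary and one topic-proportion vector per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]          # count of topic k in doc d
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions; each row sums to 1.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return vocab, theta


def bow_features(doc, vocab):
    """Baseline Bag of Words: one term count per vocabulary entry."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]


def transform_features(theta_d):
    """Feature transformation: the topic proportions replace the terms."""
    return list(theta_d)


def expand_features(doc, vocab, theta_d):
    """Feature expansion: the topic proportions are appended to the terms."""
    return bow_features(doc, vocab) + list(theta_d)


# Hypothetical tokenized posts, invented for this sketch.
docs = [
    ["coffee", "latte", "espresso", "cafe"],
    ["flight", "hotel", "travel", "booking"],
    ["latte", "cafe", "pastry"],
    ["travel", "flight", "deal"],
]
NUM_TOPICS = 2
vocab, theta = fit_lda(docs, NUM_TOPICS)
transformed = [transform_features(t) for t in theta]
expanded = [expand_features(doc, vocab, t) for doc, t in zip(docs, theta)]
```

Under these assumptions, each transformed vector has `NUM_TOPICS` entries, while each expanded vector has `len(vocab) + NUM_TOPICS` entries, so short posts keep their sparse term counts and also gain a dense topic signal.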
Keywords :
advertising data processing; pattern classification; social networking (online); Twitter; bag of words model; business type classification; classification algorithm; feature processing framework; feature transformation; latent Dirichlet allocation algorithm; marketing channel; text categorization models; Blogs; Encyclopedias; Information filters; Internet; Latent Dirichlet Allocation (LDA); Multi-classification; topic model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), 2011 8th International Conference on
Conference_Location :
Khon Kaen
Print_ISBN :
978-1-4577-0425-3
Type :
conf
DOI :
10.1109/ECTICON.2011.5947886
Filename :
5947886