DocumentCode
2187166
Title
Multi-classification of business types on twitter based on topic model
Author
Thongsuk, Chanattha ; Haruechaiyasak, Choochart ; Saelee, Somkid
Author_Institution
King Mongkut's Univ. of Technol. North Bangkok (KMUTNB), Bangkok, Thailand
fYear
2011
fDate
17-19 May 2011
Firstpage
508
Lastpage
511
Abstract
Many businesses have adopted Twitter as a marketing channel to promote their products and services. One potentially useful application is recommending businesses for users to follow that match their interests. One possible solution is to apply a classification algorithm that assigns a user's Twitter posts to predefined business categories. Because Twitter posts are short, classifying them is difficult. In this paper, we propose a feature processing framework for constructing text categorization models. A topic model is constructed from a set of terms using the Latent Dirichlet Allocation (LDA) algorithm. We apply the topic model in two feature processing approaches: (1) feature transformation, i.e., using a set of topics as features, and (2) feature expansion, i.e., appending a set of topics to a set of terms. Experimental results show that the feature expansion technique achieves the highest accuracy, 95.7%, an improvement of 18.7% over the Bag of Words (BOW) model.
Keywords
advertising data processing; pattern classification; social networking (online); Twitter; bag of words model; business type classification; classification algorithm; feature processing framework; feature transformation; latent Dirichlet allocation algorithm; marketing channel; text categorization models; Blogs; Encyclopedias; Information filters; Internet; Latent Dirichlet Allocation (LDA); Multi-classification; topic model
fLanguage
English
Publisher
IEEE
Conference_Titel
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), 2011 8th International Conference on
Conference_Location
Khon Kaen
Print_ISBN
978-1-4577-0425-3
Type
conf
DOI
10.1109/ECTICON.2011.5947886
Filename
5947886