Title :
Microblogging sentiment analysis with lexical based and machine learning approaches
Author_Institution :
Fac. of Inf., Telkom Inst. of Technol., Bandung, Indonesia
Abstract :
The digital world is developing rapidly, particularly through the proliferation of social media in Indonesia. Twitter has become one of the social media platforms whose user base has expanded into every sector of society. Many parties, both individuals and organizations or enterprises, use Twitter as a tool for communication, business, customer relations, and other activities. With Twitter's ever-growing number of users pursuing these purposes, a precise method for analyzing opinion-containing sentences effectively and efficiently has become crucial. This research therefore analyzes two methods for classifying opinion-containing tweets: a lexical-based approach and a model-based approach using machine learning. The machine learning methods tested are Support Vector Machine (SVM), Maximum Entropy (ME), Multinomial Naive Bayes (MNB), and k-Nearest Neighbor (k-NN). Based on the test results, the lexical-based approach depends heavily on the lexical database that serves as the opinion classification matrix. The machine learning approach, by contrast, can produce better accuracy because it can model new training data based on the resulting model. However, the machine learning model-based approach depends on various factors when analyzing sentiment.
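The abstract's point that a lexical-based classifier depends on its lexical database can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon `TOY_LEXICON`, the function `classify_lexical`, and the simple sum-of-polarities rule are all assumptions made for illustration, since the record does not describe the actual database or scoring scheme.

```python
import re

# Hypothetical toy lexicon mapping words to polarity scores.
# The paper's actual lexical database is not given in this record.
TOY_LEXICON = {
    "good": 1, "great": 1, "love": 1, "fast": 1,
    "bad": -1, "slow": -1, "hate": -1, "broken": -1,
}


def classify_lexical(tweet, lexicon=TOY_LEXICON):
    """Sum the polarity of known words and map the total to a label."""
    # Lowercase the tweet and keep only runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", tweet.lower())
    # Words missing from the lexicon contribute nothing to the score.
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, a tweet such as "awesome service" is labelled neutral simply because neither word is in the toy lexicon, and it becomes positive once "awesome" is added. That is the sense in which the result is bounded by the coverage of the lexical database.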
Keywords :
Bayes methods; data models; learning (artificial intelligence); maximum entropy methods; pattern classification; social networking (online); support vector machines; text analysis; Indonesia; ME; MNB; SVM; Twitter; business; communication tool; customer relation; digital world; enterprise; k-NN; k-nearest neighbor; lexical based approach; lexical database; machine learning; maximum entropy; microblogging sentiment analysis; model based approach; multinomial naive Bayes; opinion classification matrix; opinion-contained sentences; opinion-contained tweet classification; organization; social media; support vector machine; training data modeling; Accuracy; Communications technology; Data mining; Data models; Databases; Entropy; Support vector machines; Maximum Entropy; Multinomial Naive Bayes; Support Vector Machine; Twitter; k-Nearest Neighbor; lexical based; machine learning; tweet;
Conference_Title :
Information and Communication Technology (ICoICT), 2013 International Conference of
Conference_Location :
Bandung
Print_ISBN :
978-1-4673-4990-1
DOI :
10.1109/ICoICT.2013.6574616