Title :
Online reviews sentiment analysis applying mutual information
Author :
Wang Zuhui ; Jiang Wei
Author_Institution :
Res. Center of Inf. Manage. & Inf. Syst., Harbin Inst. of Technol., Harbin, China
Abstract :
The extraction of complex features is essential to the performance of online review sentiment analysis. Besides conventional bag-of-words features, regular collocation features play an increasingly important role, because their structured expression has a strong impact on sentiment orientation. This paper proposes applying the mutual information method to mine complex features from online reviews, extending feature extraction from conventional bags of words to regular collocations. With the extracted collocation features as inputs to the analysis model, experiments on online hotel review data show that the presented extraction method improves the performance of the Naive Bayes model by 1.36% and the performance of the Maximum Entropy model by 0.92%. On the other hand, the imbalance between positive and negative reviews causes a distortion in which majority-class features conceal minority-class ones, and the extreme sentiment of the minority class introduces noise into the dataset. To address the imbalance problem and the corresponding parameter estimation problem, a λ feature filtering strategy and the Good-Turing smoothing method are adopted to further improve the performance of the sentiment analysis model.
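The abstract does not give the exact mutual information formulation, so the following is only a minimal sketch of one common variant: pointwise mutual information (PMI) over adjacent word pairs, used to rank candidate collocations. The function name `pmi_collocations`, the `min_count` threshold, and the toy hotel-review tokens are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter

def pmi_collocations(docs, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    Hypothetical helper: PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ),
    estimated from raw unigram and adjacent-bigram counts.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in docs:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs only

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:  # assumed cutoff to skip rare, noisy pairs
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Toy, made-up tokenized reviews (not data from the paper).
docs = [
    ["room", "very", "clean", "staff", "very", "friendly"],
    ["room", "not", "clean", "staff", "not", "friendly"],
    ["very", "clean", "room", "very", "friendly", "staff"],
]
scores = pmi_collocations(docs)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for pair, score in ranked:
    print(pair, round(score, 3))
```

Pairs that pass the cutoff could then be added alongside the single-word features fed to a classifier such as Naive Bayes; how the paper combines them, and how its λ filter and Good-Turing smoothing are applied, is not specified in this record.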
Keywords :
Bayes methods; Internet; electronic commerce; collocation features; complicated features extraction; feature filtering strategy; good turing smooth method; maximum entropy model; mutual information; naive Bayes analysis model; online hotel reviews data; online reviews sentiment analysis; sentiment orientation; word bag features; Analytical models; Data mining; Entropy; Feature extraction; Filtering; Mutual information; Support vector machines; Good Turing; Mutual information; Naive Bayes; Sentiment analysis;
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
DOI :
10.1109/FSKD.2012.6233865