Title :
Mining Online Reviews for Predicting Sales Performance: A Case Study in the Movie Domain
Author :
Yu, Xiaohui ; Liu, Yang ; Huang, Xiangji ; An, Aijun
Author_Institution :
Sch. of Comput. Sci. & Technol., Shandong Univ., Jinan, China
fDate :
4/1/2012
Abstract :
Posting reviews online has become an increasingly popular way for people to express opinions and sentiments toward the products bought or services received. Analyzing the large volume of online reviews available would produce useful actionable knowledge that could be of economic value to vendors and other interested parties. In this paper, we conduct a case study in the movie domain, and tackle the problem of mining reviews for predicting product sales performance. Our analysis shows that both the sentiments expressed in the reviews and the quality of the reviews have a significant impact on the future sales performance of the products in question. For the sentiment factor, we propose Sentiment PLSA (S-PLSA), in which a review is considered as a document generated by a number of hidden sentiment factors, in order to capture the complex nature of sentiments. Training an S-PLSA model enables us to obtain a succinct summary of the sentiment information embedded in the reviews. Based on S-PLSA, we propose ARSA, an Autoregressive Sentiment-Aware model for sales prediction. We then seek to further improve the accuracy of prediction by considering the quality factor, with a focus on predicting the quality of a review in the absence of user-supplied indicators, and present ARSQA, an Autoregressive Sentiment and Quality Aware model, to utilize sentiments and quality for predicting product sales performance. Extensive experiments conducted on a large movie data set confirm the effectiveness of the proposed approach.
Keywords :
autoregressive processes; data mining; entertainment; information analysis; S-PLSA model; actionable knowledge; autoregressive sentiment aware model; hidden sentiment factors; movie data set; movie domain; online reviews mining; product sales performance prediction; quality aware model; quality factor; sentiment PLSA; user supplied indicators; Biological system modeling; Information services; Internet; Marketing and sales; Motion pictures; Predictive models; Web sites; Review mining; prediction; sentiment analysis;
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2010.269