DocumentCode :
710111
Title :
Dish comment summarization based on bilateral topic analysis
Author :
Rong Zhang ; Zhenjie Zhang ; Xiaofeng He ; Aoying Zhou
Author_Institution :
Shanghai Key Lab. of Trustworthy Comput., East China Normal Univ., Shanghai, China
fYear :
2015
fDate :
13-17 April 2015
Firstpage :
483
Lastpage :
494
Abstract :
With the prosperity of online services enabled by Web 2.0, a huge amount of human-generated commentary data is now available on the Internet, covering a wide range of domains and products. Such comments contain valuable information for other customers, but are usually difficult to utilize due to the lack of a common description structure, the complexity of opinion expression, and the fast-growing data volume. Comment-based restaurant summarization is even more challenging than summarization for other types of products and services, as users' comments on restaurants usually mix opinions on different dishes but carry only one overall evaluation score for the whole experience with the restaurant. It is thus crucial to distinguish well-made dishes from lousy ones by mining the comment archive, in order to generate meaningful and useful summaries for other potential customers. This paper presents a novel approach to the problem of restaurant comment summarization, whose core technique is a new bilateral topic analysis model applied to the commentary text data. In the bilateral topic model, the attributes discussed in the comments on the dishes and the user's evaluation of those attributes are considered as two independent dimensions in the latent space. Combined with new opinionated word extraction and clustering-based representation selection algorithms, our new analysis technique is effective at generating high-quality summaries using representative snippets from the text comments. We evaluate our proposals on two real-world comment archives crawled from the most popular English and Chinese online restaurant review web sites, Yelp and Dianping. The experimental results show that our proposals outperform baseline approaches from the literature by a large margin in summarization quality.
Keywords :
Internet; catering industry; computational complexity; text analysis; Chinese online restaurant review Web sites; English online restaurant review Web sites; Web 2.0; bilateral topic analysis; clustering-based representation selection algorithm; comment-based restaurant summarization; dish comment summarization; fast growing data volume; high-quality summary; online services; opinion expression complexity; opinionated word extraction; representative snippets; Algorithm design and analysis; Analytical models; Correlation; Data mining; Hidden Markov models; Proposals; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
Type :
conf
DOI :
10.1109/ICDE.2015.7113308
Filename :
7113308