• DocumentCode
    710111
  • Title
    Dish comment summarization based on bilateral topic analysis
  • Author
    Rong Zhang ; Zhenjie Zhang ; Xiaofeng He ; Aoying Zhou
  • Author_Institution
    Shanghai Key Lab. of Trustworthy Comput., East China Normal Univ., Shanghai, China
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    483
  • Lastpage
    494
  • Abstract
    With the prosperity of online services enabled by Web 2.0, a huge amount of human-generated commentary data is now available on the Internet, covering a wide range of domains and products. Such comments contain valuable information for other customers, but are usually difficult to utilize due to the lack of a common description structure, the complexity of opinion expression, and fast-growing data volume. Comment-based restaurant summarization is even more challenging than summarization for other types of products and services, as users' comments on restaurants usually mix opinions on different dishes but carry only one overall evaluation score for the whole experience with the restaurant. It is thus crucial to distinguish well-made dishes from poor ones by mining the comment archive, in order to generate meaningful and useful summaries for other potential customers. This paper presents a novel approach to restaurant comment summarization, built on a new bilateral topic analysis model for commentary text data. In the bilateral topic model, the attributes discussed in the comments on the dishes and the user's evaluation of those attributes are considered as two independent dimensions in the latent space. Combined with new opinionated word extraction and clustering-based representation selection algorithms, our analysis technique effectively generates high-quality summaries using representative snippets from the text comments. We evaluate our proposals on two real-world comment archives crawled from the most popular English and Chinese online restaurant review web sites, Yelp and Dianping. The experimental results show that our proposals outperform baseline approaches from the literature on summarization quality by a large margin.
  • Keywords
    Internet; catering industry; computational complexity; text analysis; Chinese online restaurant review Web sites; English online restaurant review Web sites; Web 2.0; bilateral topic analysis; clustering-based representation selection algorithm; comment-based restaurant summarization; dish comment summarization; fast growing data volume; high-quality summary; online services; opinion expression complexity; opinionated word extraction; representative snippets; Algorithm design and analysis; Analytical models; Correlation; Data mining; Hidden Markov models; Proposals; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2015 IEEE 31st International Conference on
  • Conference_Location
    Seoul
  • Type
    conf
  • DOI
    10.1109/ICDE.2015.7113308
  • Filename
    7113308