Title :
Towards Summarizing Popular Information from Massive Tourism Blogs
Author :
Hua Yuan ; Hualin Xu ; Yu Qian ; Kai Ye
Author_Institution :
Sch. of Manage. & Econ., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
Abstract :
In this work, we propose a method for summarizing popular information from massive tourism blog data. First, we crawl blog contents from a website and segment each post into its own semantic word vector. Second, we extract the geographical terms from each word vector into a corresponding geographical term vector, and present a new method that uses the set of geographical term vectors to discover hot tourism locations and, in particular, the frequent sequential relations among them. Third, we propose a novel word vector subdividing method to collect the local features of each hot location, and introduce the max-confidence metric to identify the Things of Interest (ToI) associated with that location from the collected data. We illustrate the benefits of this approach by applying it to a Chinese online tourism blog data set. The experimental results show that the proposed method can efficiently discover hot locations, their sequential relations, and their corresponding ToI.
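The abstract names two computations without giving formulas: frequent sequential relations between locations, and a max-confidence score linking a location to a candidate ToI. The sketch below is illustrative only and is not the authors' implementation. It assumes the definition of max-confidence commonly used in association mining, max(P(term | location), P(location | term)), and it approximates a "sequential relation" as a pair of consecutive locations counted once per blog. The function names, the `min_support` threshold, and the toy data are all hypothetical.

```python
from collections import Counter


def max_confidence(docs, location, term):
    """Return max(P(term | location), P(location | term)) over word sets.

    Assumption: this is the usual association-mining definition; the
    record above does not state the paper's exact formula.
    """
    n_loc = sum(1 for d in docs if location in d)
    n_term = sum(1 for d in docs if term in d)
    n_both = sum(1 for d in docs if location in d and term in d)
    if n_loc == 0 or n_term == 0:
        return 0.0
    return max(n_both / n_loc, n_both / n_term)


def frequent_transitions(geo_vectors, min_support):
    """Keep consecutive location pairs seen in at least min_support blogs.

    Assumption: each pair is counted at most once per blog, and only
    directly adjacent locations are treated as a sequential relation.
    """
    counts = Counter()
    for vec in geo_vectors:
        counts.update(set(zip(vec, vec[1:])))
    return {pair: c for pair, c in counts.items() if c >= min_support}


# Made-up toy data: one word set and one ordered location list per blog.
docs = [
    {"Chengdu", "panda", "hotpot"},
    {"Chengdu", "panda"},
    {"Lhasa", "palace"},
    {"Chengdu", "hotpot", "palace"},
]
geo_vectors = [
    ["Chengdu", "Jiuzhaigou", "Chengdu"],
    ["Chengdu", "Jiuzhaigou"],
    ["Lhasa", "Chengdu"],
]

panda_score = max_confidence(docs, "Chengdu", "panda")
palace_score = max_confidence(docs, "Chengdu", "palace")
transitions = frequent_transitions(geo_vectors, min_support=2)
```

In this toy data, "panda" appears only in blogs that mention "Chengdu", so it scores higher than "palace", which also appears with another location. Taking the maximum of the two conditional probabilities keeps the score from being dominated by how often the location itself is mentioned.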
Keywords :
Web sites; travel industry; vectors; Things of Interest; ToI; Web site; blog contents; geographical term vectors; massive tourism blogs; popular information summarization; semantic word vector; Blogs; Cleaning; Correlation; Data mining; Measurement; Semantics; Vectors; blog mining; hot tourism locations; max-confidence; things of interest;
Conference_Title :
Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4799-4275-6
DOI :
10.1109/ICDMW.2014.29