Title :
Sequential Pattern Analysis with Right Granularity
Author :
Chuanren Liu ; Kai Zhang ; Hui Xiong
Author_Institution :
Rutgers, State Univ. of New Jersey, Piscataway, NJ, USA
Abstract :
Sequential pattern analysis aims to find statistically relevant temporal structures in data whose values are delivered in a sequence. This is a fundamental problem in data mining with diverse applications in many science and business fields, such as multimedia analysis (motion gesture/video sequence recognition), marketing analytics (buying path prediction), and financial modelling (trend of stock prices). Given the overwhelming scale and the heterogeneous nature of sequential data, new techniques for sequential pattern analysis are required to derive competitive advantages and unlock the power of big data. In this dissertation, we develop novel approaches for sequential pattern analysis with applications in dynamic business environments, including operation and management tasks in the healthcare industry as well as B2B (Business-to-Business) marketing. Our major contribution is to identify the right granularity for sequential pattern analysis, including both sequential pattern modelling and mining. Due to space limitations, this submission mainly presents "temporal skeletonization", our approach to identifying the meaningful granularity for sequential pattern mining. Our key idea is to summarize the temporal correlations in an undirected graph. The "skeleton" of the graph then serves as a higher level of granularity on which hidden temporal patterns are more likely to be identified. At the same time, the embedding topology of the graph allows us to translate the rich temporal content into a metric space. This opens up new possibilities to explore, quantify, and visualize sequential data. Our approach has been shown to provide substantial improvements over state-of-the-art methods in challenging tasks of sequential pattern mining and sequence clustering. Evaluation on a B2B marketing application demonstrates that our approach can effectively discover critical buying paths from noisy customer event data.
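The pipeline outlined in the abstract (temporal correlations summarized in an undirected graph, a graph embedding, then a coarser granularity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the graph weights, the embedding, or the clustering step. The function names (`temporal_graph`, `spectral_embedding`, `coarsen`), the `window` parameter, the use of the normalized graph Laplacian, and the two-way split by the sign of its second eigenvector are all assumptions chosen for illustration.

```python
import numpy as np

def temporal_graph(sequences, window=1):
    """Assumed weighting: count how often two distinct symbols
    occur at most `window` steps apart (direction ignored)."""
    symbols = sorted({s for seq in sequences for s in seq})
    index = {s: i for i, s in enumerate(symbols)}
    W = np.zeros((len(symbols), len(symbols)))
    for seq in sequences:
        for i, a in enumerate(seq):
            for b in seq[i + 1 : i + 1 + window]:
                if a != b:
                    W[index[a], index[b]] += 1
                    W[index[b], index[a]] += 1
    return symbols, W

def spectral_embedding(W, dim=1):
    """Embed symbols with the low eigenvectors of the normalized Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)   # eigenvalues come back in ascending order
    return vecs[:, 1 : 1 + dim]   # skip the trivial first eigenvector

def coarsen(seq, label):
    """Rewrite a sequence at the coarser granularity, merging repeats."""
    out = []
    for s in seq:
        g = label[s]
        if not out or out[-1] != g:
            out.append(g)
    return out

# Toy event sequences: a/b/c interleave early, x/y/z interleave late.
sequences = [list("abcabcxyzxyz"), list("bacxzy")]
symbols, W = temporal_graph(sequences, window=1)
coords = spectral_embedding(W)[:, 0]

# Two-way split by sign; which group gets 0 or 1 is arbitrary.
label = {s: int(c > 0) for s, c in zip(symbols, coords)}
coarse = [coarsen(seq, label) for seq in sequences]
print(label)
print(coarse)
```

On this toy input the symbols `a`, `b`, `c` fall into one group and `x`, `y`, `z` into the other, so each noisy sequence collapses to a two-step path at the coarser level. The one-dimensional coordinates in `coords` also give a simple metric view of the symbols; a realistic setting would likely need more embedding dimensions and a proper clustering step.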
Keywords :
Big Data; data mining; data visualisation; health care; marketing; pattern classification; B2B; business-to-business marketing; buying path prediction; dynamic business environments; financial modelling; healthcare industry; marketing analytics; motion gesture; multimedia analysis; noisy customer event data; rich temporal content; right granularity; sequential data exploration; sequential data quantification; sequential data visualization; sequential pattern analysis; sequential pattern mining; sequential pattern modelling; statistically relevant temporal structures; stock price trend; undirected graph; video sequence recognition; Business; Data models; Data visualization; Hidden Markov models; Medical services; Pattern analysis
Conference_Title :
Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4799-4275-6
DOI :
10.1109/ICDMW.2014.164