Title :
A Data Science Solution for Mining Interesting Patterns from Uncertain Big Data
Author :
Leung, Carson Kai-Sang; Jiang, Fan
Author_Institution :
Dept. of Comput. Sci., Univ. of Manitoba, Winnipeg, MB, Canada
Abstract :
Nowadays, high volumes of valuable uncertain data can be easily collected or generated at high velocity in many real-life applications. Mining these uncertain Big data is computationally intensive due to the presence of existential probability values associated with items in every transaction in the uncertain data. Each existential probability value expresses the likelihood that the item is present in a particular transaction in the Big data. In some situations, users may be interested in mining all frequent patterns from these uncertain Big data; in other situations, users may be interested in only a tiny portion of these mined patterns. To reduce the computation and to focus the mining in the latter situations, we propose a tree-based algorithm that (i) allows users to express the patterns to be mined according to their intention via the use of constraints and (ii) uses MapReduce to mine uncertain Big data for only those frequent patterns that satisfy user-specified constraints. Experimental results show the effectiveness of our algorithm in mining interesting patterns from uncertain Big data.
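The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' tree-based algorithm: it enumerates candidate patterns by brute force and only imitates the map and reduce steps in a single process. It assumes the definition commonly used in uncertain frequent pattern mining, where the expected support of a pattern is the sum, over all transactions, of the product of the existential probabilities of its items (items treated as independent). The toy data, the function names, the "allowed items" constraint, and the minsup threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import prod  # Python 3.8+

# Hypothetical toy data: each transaction maps an item to its
# existential probability in that transaction.
TRANSACTIONS = [
    {"a": 0.9, "b": 0.5, "c": 0.8},
    {"a": 0.6, "c": 0.5},
    {"b": 1.0, "c": 0.4},
]


def map_transaction(transaction, allowed_items, max_len=2):
    """Map step (sketch): emit (pattern, probability) pairs for one transaction.

    The hypothetical user constraint "every item must be in allowed_items"
    is applied before enumeration, so unwanted patterns are never generated.
    """
    items = sorted(i for i in transaction if i in allowed_items)
    for k in range(1, max_len + 1):
        for pattern in combinations(items, k):
            # Probability that the whole pattern occurs in this transaction,
            # assuming the items are independent.
            yield pattern, prod(transaction[i] for i in pattern)


def reduce_expected_support(pairs):
    """Reduce step (sketch): sum the per-transaction probabilities per pattern."""
    totals = defaultdict(float)
    for pattern, probability in pairs:
        totals[pattern] += probability
    return dict(totals)


def mine(transactions, allowed_items, minsup, max_len=2):
    """Return constrained patterns whose expected support is at least minsup."""
    pairs = (
        pair
        for t in transactions
        for pair in map_transaction(t, allowed_items, max_len)
    )
    expected = reduce_expected_support(pairs)
    return {p: s for p, s in expected.items() if s >= minsup}


if __name__ == "__main__":
    for pattern, support in sorted(mine(TRANSACTIONS, {"a", "c"}, 1.0).items()):
        print(pattern, round(support, 2))
```

Applying the constraint inside the map step, rather than filtering after mining, is what keeps the computation focused on the small portion of patterns the user asked for.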
Keywords :
Big Data; data mining; parallel processing; probability; MapReduce; data science solution; existential probability values; frequent pattern mining; tree-based algorithm; uncertain Big Data; Association rules; Computational modeling; Data models; Databases; Program processors; Big data analytics; Big data and cloud computing; Big data applications; Big data mining
Conference_Title :
Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/BDCloud.2014.136