DocumentCode
167610
Title
MRPrePost—A parallel algorithm adapted for mining big data
Author
Jinggui Liao ; Yuelong Zhao ; Saiqin Long
Author_Institution
Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou, China
fYear
2014
fDate
8-9 May 2014
Firstpage
564
Lastpage
568
Abstract
With the explosive growth of data, using data mining techniques to mine association rules and thereby find valuable information hidden in big data has become increasingly important. Existing data mining techniques typically derive association rules and extract relevant knowledge by mining frequent itemsets, but with the rapid arrival of the big data era, traditional data mining algorithms can no longer meet the analysis needs of large data sets. In view of this, this paper proposes MRPrePost, a parallel algorithm adapted for mining big data. MRPrePost is a parallel algorithm based on the Hadoop platform. It improves PrePost by adding a prefix pattern and, on that basis, incorporates a parallel design, so that MRPrePost can mine association rules from large data sets. Experiments show that MRPrePost outperforms PrePost and PFP, and that it has better stability and scalability.
Keywords
Big Data; data analysis; data mining; parallel algorithms; public domain software; Hadoop platform; MRPrePost algorithm; algorithm scalability; algorithm stability; big data mining; data analysis; data association rule mining; frequent itemset mining; parallel algorithm; parallel design; prefix pattern; IP networks; Itemsets; Knowledge engineering; Polymers; MRPrePost algorithm; PrePost algorithm; big data; data mining; parallelization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Electronics, Computer and Applications, 2014 IEEE Workshop on
Conference_Location
Ottawa, ON
Type
conf
DOI
10.1109/IWECA.2014.6845683
Filename
6845683