Title :
Parallel pattern mining - application of GSP algorithm for Graphics Processing Units
Author :
Hryniów, Krzysztof
Author_Institution :
Inst. of Control & Ind. Electron., Warsaw Univ. of Technol., Warsaw, Poland
Abstract :
Frequent pattern mining is a field with many practical applications in which large computational power and speed are needed. Many solutions, both software and hardware, have been proposed for those applications, but specialised solutions in the form of embedded systems are not as common as one might expect. This is especially true for problems that can be parallelised. Many state-of-the-art frequent pattern mining applications are inefficient when used on shared memory systems or multiprocessor systems. To solve this problem, both hardware and software solutions have been proposed: remapping the system architecture, improving memory performance, and modifying task allocation. This article proposes a modification of classical frequent pattern mining algorithms from the Apriori family, illustrated with the example of the popular GSP algorithm. The addition of a GPU (Graphics Processing Unit) or multiple GPUs to an embedded system is proposed, and the algorithm is modified so that it is best suited to GPGPU (general-purpose computation on graphics hardware) processing. Both theoretical and experimental evaluations of the modifications are made, the latter using a setup consisting of an NVIDIA Tesla card and the CUDA parallel computing platform. The article shows that, for the tested data sets, the modified GSP algorithm finds frequent sequences 50-100 times faster with the same accuracy. Such a speed-up allows classical pattern mining algorithms to be used in real-time solutions. It also permits a broader range of algorithms to be used in embedded systems with real-time constraints.
Keywords :
data mining; embedded systems; graphics processing units; multiprocessing systems; parallel processing; shared memory systems; Apriori family; CUDA parallel computing platform; GPGPU; GSP algorithm; NVIDIA Tesla card; computational power; computational speed; frequent pattern mining; general purpose computation on graphics hardware; graphics processing unit; memory performance; multiprocessor systems; parallel pattern mining application; state-of-the-art frequent pattern mining applications; task allocation; Computational complexity; Hardware; Itemsets; Real time systems; CUDA; GSP; sequential pattern mining
Conference_Title :
Carpathian Control Conference (ICCC), 2012 13th International
Conference_Location :
High Tatras
Print_ISBN :
978-1-4577-1867-0
DOI :
10.1109/CarpathianCC.2012.6228645