Title :
Parallel processing for distance-based outlier detection on a multi-core CPU
Author :
Oku, Junki ; Tamura, Keiichi ; Kitakami, Hajime
Author_Institution :
Grad. Sch. of Inf. Sci., Hiroshima City Univ., Hiroshima, Japan
Abstract :
Outliers are data objects that are unlikely to occur, such as errors, fraudulent records, and rare data. Over the last few decades, outlier detection has attracted much attention from researchers because it is used in many application domains. Distance-based outlier detection, a non-parametric approach, identifies unusual data objects in a database by using their distance to neighbors as a measure of unusualness. Algorithms for distance-based outlier detection are known for their long computation time. One of the most successful algorithms for speeding up distance-based outlier detection is Orca, which is based on a nested loop with randomization and a simple pruning rule. In this paper, we propose a new parallelization model for Orca-based outlier detection on a multi-core CPU. The proposed model uses data parallelism and a multi-thread model. In Orca, worker threads must share an outlier-score table and a cutoff value for pruning. To reduce contention on these shared data, the proposed model manages outlier-score tables hierarchically and keeps a cached copy of the cutoff value in each worker thread. Experimental results show that the proposed model outperforms a conventional parallelization model that uses only data parallelism and does not manage outlier-score tables hierarchically.
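The nested loop with randomization and pruning mentioned in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' implementation. It assumes the outlier score is the distance to the k-th nearest neighbor (Orca-style methods also allow other scores, such as the average distance to the k nearest neighbors). The function name `top_n_outliers` and its parameters are hypothetical. The hierarchical score tables and per-thread cutoff cache of the proposed model are not shown.

```python
import heapq
import math
import random


def top_n_outliers(points, k, n, seed=0):
    """Return up to n (score, index) pairs with the largest k-th nearest
    neighbor distance, largest first. Assumed score definition."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # randomization: close neighbors tend to appear early

    top = []      # min-heap of (score, index): the outlier-score table
    cutoff = 0.0  # smallest score in a full table; stays 0 until it is full

    for i in range(len(points)):
        nearest = []  # negated distances: max-heap of the k smallest so far
        pruned = False
        for j in order:
            if j == i:
                continue
            d = math.dist(points[i], points[j])
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # Pruning rule: the running k-th nearest distance can only
            # shrink, so once it is below the cutoff, point i cannot
            # enter the table and the inner loop can stop.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_n_outliers(data, k=2, n=1))
```

In this sketch, `top` and `cutoff` correspond to the outlier-score table and cutoff value that, according to the abstract, worker threads have to share when the outer loop is split across threads.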
Keywords :
data handling; multiprocessing systems; parallel processing; Orca algorithm; central processing unit; cutoff value; data parallelism; distance-based outlier detection; multicore CPU; multithread model; nested loop; outlier-score table; parallelization model; worker thread; Data models; Databases; Detection algorithms; Image color analysis; Instruction sets; Object recognition; multi-core CPU; outlier
Conference_Title :
Computational Intelligence and Applications (IWCIA), 2014 IEEE 7th International Workshop on
Conference_Location :
Hiroshima
Print_ISBN :
978-1-4799-4771-3
DOI :
10.1109/IWCIA.2014.6988080