DocumentCode
1768985
Title
Parallel processing for distance-based outlier detection on a multi-core CPU
Author
Oku, Junki ; Tamura, Keiichi ; Kitakami, Hajime
Author_Institution
Grad. Sch. of Inf. Sci., Hiroshima City Univ., Hiroshima, Japan
fYear
2014
fDate
7-8 Nov. 2014
Firstpage
65
Lastpage
70
Abstract
Outliers are data objects that are unlikely to occur, such as errors, fraudulent data, and rare data. In the last few decades, outlier detection has attracted much attention from researchers because it is used in many different application domains. Distance-based outlier detection, a non-parametric approach, identifies unusual data objects in a database by using each object's distance to its neighbors as a measure of unusualness. Algorithms for distance-based outlier detection are known for their long computation time. One of the most successful improvements to distance-based outlier detection is Orca, which is based on a nested loop with randomization and a simple pruning rule. In this paper, we propose a new parallelization model for Orca-based outlier detection on a multi-core CPU. The proposed parallelization model utilizes data parallelism and a multi-thread model. Orca requires worker threads to share an outlier-score table and a cutoff value for pruning. To reduce conflicts on these shared data, the proposed parallelization model manages outlier-score tables hierarchically and caches the cutoff value on each worker thread. The experimental results show that the proposed parallelization model outperforms a conventional parallelization model that utilizes only data parallelism without managing outlier-score tables hierarchically.
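The nested loop with randomization and pruning mentioned in the abstract can be illustrated with a minimal serial sketch. It is not the authors' implementation: the function name `orca_top_outliers`, the use of Euclidean distance, and the choice of the distance to the k-th nearest neighbor as the outlier score are assumptions made here for illustration only.

```python
import heapq
import math
import random


def orca_top_outliers(points, k, n, seed=0):
    """Hypothetical helper: return the n (score, index) pairs with the
    largest k-th-nearest-neighbor distance (assumed outlier score)."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # randomization: scan neighbors in random order

    top = []      # min-heap of (score, index): the outlier-score table
    cutoff = 0.0  # smallest score in a full table; used for pruning

    for i in range(len(points)):
        nearest = []  # negated max-heap of the k smallest distances so far
        pruned = False
        for j in order:
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # Pruning rule: the running k-th neighbor distance can only
            # shrink, so once it falls below the cutoff, point i cannot
            # enter the top-n table and the inner loop stops early.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]  # raise the cutoff as the table fills

    return sorted(top, reverse=True)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(orca_top_outliers(data, k=2, n=1))
```

In this sketch, `top` and `cutoff` play the roles of the outlier-score table and the cutoff value; these are the two pieces of state that, according to the abstract, must be shared among worker threads when the outer loop is split across threads.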
Keywords
data handling; multiprocessing systems; parallel processing; Orca algorithm; central processing unit; cutoff value; data parallelism; distance-based outlier detection; multicore CPU; multithread model; nested loop; outlier-score table; parallel processing; parallelization model; worker thread; Data models; Databases; Detection algorithms; Image color analysis; Instruction sets; Object recognition; Parallel processing; distance-based outlier detection; multi-core CPU; outlier; parallel processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Applications (IWCIA), 2014 IEEE 7th International Workshop on
Conference_Location
Hiroshima
ISSN
1883-3977
Print_ISBN
978-1-4799-4771-3
Type
conf
DOI
10.1109/IWCIA.2014.6988080
Filename
6988080