• DocumentCode
    1768985
  • Title
    Parallel processing for distance-based outlier detection on a multi-core CPU
  • Author
    Oku, Junki ; Tamura, Keiichi ; Kitakami, Hajime
  • Author_Institution
    Grad. Sch. of Inf. Sci., Hiroshima City Univ., Hiroshima, Japan
  • fYear
    2014
  • fDate
    7-8 Nov. 2014
  • Firstpage
    65
  • Lastpage
    70
  • Abstract
    Outliers are data objects that are unlikely to occur: unusual objects such as errors, fraudulent data, and rare data. In the last few decades, outlier detection has attracted much attention from researchers because it is used in many different application domains. Distance-based outlier detection, a non-parametric approach, identifies unusual data objects in a database by using their distance to neighbors as a measure of unusualness. Algorithms for distance-based outlier detection are known for their long computation times. One of the most successful improvements to distance-based outlier detection is Orca, which is based on a nested loop with randomization and a simple pruning rule. In this paper, we propose a new parallelization model for Orca-based outlier detection on a multi-core CPU. The proposed model uses data parallelism and a multi-thread model. Orca processing requires worker threads to share an outlier-score table and a cutoff value for pruning. To reduce contention on these shared structures, the proposed model manages outlier-score tables hierarchically and keeps a cache of the cutoff value on each worker thread. The experimental results show that the proposed model outperforms a conventional parallelization model that uses only data parallelism, without hierarchical management of outlier-score tables.
  • Keywords
    data handling; multiprocessing systems; parallel processing; Orca algorithm; central processing unit; cutoff value; data parallelism; distance-based outlier detection; multicore CPU; multithread model; nested loop; outlier-score table; parallel processing; parallelization model; worker thread; Data models; Databases; Detection algorithms; Image color analysis; Instruction sets; Object recognition; Parallel processing; distance-based outlier detection; multi-core CPU; outlier; parallel processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Applications (IWCIA), 2014 IEEE 7th International Workshop on
  • Conference_Location
    Hiroshima
  • ISSN
    1883-3977
  • Print_ISBN
    978-1-4799-4771-3
  • Type
    conf
  • DOI
    10.1109/IWCIA.2014.6988080
  • Filename
    6988080
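
The ideas named in the abstract (a randomized nested loop, a pruning cutoff, per-worker score tables merged into a shared table, and a cached cutoff) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of the k-th nearest-neighbor distance as the outlier score, the two-level table layout, and the once-per-block cutoff refresh are all assumptions made for illustration. Python threads also share one interpreter lock, so the sketch shows the structure only and not the multi-core speedup the paper measures.

```python
import heapq
import math
import random
import threading
from concurrent.futures import ThreadPoolExecutor


def kth_nn_distance(i, data, order, k, cutoff):
    """Return the distance from data[i] to its k-th nearest neighbour,
    or None once that distance is certain to fall below `cutoff`."""
    nearest = []  # max-heap (negated) of the k smallest distances seen so far
    for j in order:
        if j == i:
            continue
        d = math.dist(data[i], data[j])
        if len(nearest) < k:
            heapq.heappush(nearest, -d)
        elif d < -nearest[0]:
            heapq.heapreplace(nearest, -d)
        # The running k-th distance can only shrink, so it is an upper bound.
        if len(nearest) == k and -nearest[0] < cutoff:
            return None  # pruned: cannot be a top-n outlier
    return -nearest[0]


def top_outliers(data, k=5, n=3, workers=4, block_size=32, seed=0):
    """Return the n points with the largest k-th nearest-neighbour distance
    as (score, index) pairs, highest score first. Assumes len(data) > k."""
    order = list(range(len(data)))
    random.Random(seed).shuffle(order)  # randomised scan order
    global_table = []                   # shared min-heap of (score, index)
    lock = threading.Lock()

    def process(block):
        with lock:  # cache the shared cutoff once per block (assumed policy)
            cached = global_table[0][0] if len(global_table) == n else 0.0
        local = []  # worker-local score table, merged once at the end
        for i in block:
            local_cut = local[0][0] if len(local) == n else 0.0
            score = kth_nn_distance(i, data, order, k, max(cached, local_cut))
            if score is None:
                continue
            if len(local) < n:
                heapq.heappush(local, (score, i))
            elif score > local[0][0]:
                heapq.heapreplace(local, (score, i))
        with lock:  # single merge into the shared table
            for entry in local:
                if len(global_table) < n:
                    heapq.heappush(global_table, entry)
                elif entry > global_table[0]:
                    heapq.heapreplace(global_table, entry)

    blocks = [order[s:s + block_size] for s in range(0, len(order), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(process, blocks))  # list() surfaces worker exceptions
    return sorted(global_table, reverse=True)
```

In this sketch each worker touches the shared table only twice per block, once to read the cutoff and once to merge, which is one simple way to limit lock contention. A stale cached cutoff is still safe because it can only be lower than the current one, so it prunes less but never discards a true outlier.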