Regional Information Center for Science and Technology - The Anatomy of Mr. Scan: A Dissection of Performance of an Extreme Scale GPU-Based Clustering Algorithm

DocumentCode :

234511

Title :

The Anatomy of Mr. Scan: A Dissection of Performance of an Extreme Scale GPU-Based Clustering Algorithm

Author :

Welton, Benjamin ; Miller, Barton P.

Author_Institution :

Comput. Sci. Dept., Univ. of Wisconsin, Madison, WI, USA

fYear :

2014

fDate :

17 Nov. 2014

Firstpage :

Lastpage :

Abstract :

The emergence of leadership class systems with GPU-equipped nodes has the potential to vastly increase the performance of existing distributed applications. However, the inclusion of GPU computation into existing extreme scale distributed applications can reveal scalability issues that were absent in the CPU version. The issues exposed in scaling by a GPU can become limiting factors to overall application performance. We developed an extreme scale GPU-based application to perform data clustering on multi-billion point datasets. In this application, called Mr. Scan, we ran into several of these performance-limiting issues. Through the use of complete end-to-end benchmarking of Mr. Scan (measuring time from reading and distribution to final output), we were able to identify three major sources of real-world performance issues: data distribution, GPU load balancing, and system-specific issues such as start-up time. These issues accounted for the vast majority of the run time of Mr. Scan. Data distribution alone accounted for 68% of the total run time of Mr. Scan when processing 6.5 billion points on Cray Titan at 8192 nodes. With improvements in these areas, we have been able to cut the total run time of Mr. Scan from 17.5 minutes to 8.3 minutes when clustering 6.5 billion points.

Keywords :

distributed algorithms; graphics processing units; multiprocessing systems; pattern clustering; resource allocation; CPU version; GPU load balancing; GPU-equipped nodes; Mr. Scan anatomy; data clustering; data distribution; extreme scale GPU-based clustering algorithm; leadership class systems; multibillion point datasets; time 17.5 min to 8.3 min; Benchmark testing; Clustering algorithms; Graphics processing units; Load management; Partitioning algorithms; Scalability; Spatial indexes; Distributed Systems; GPU Data Clustering; DBSCAN; Performance Analysis

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), 2014 5th Workshop on

Conference_Location :

New Orleans, LA

Type :

conf

DOI :

10.1109/ScalA.2014.10

Filename :

7016734

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=234511