DocumentCode :
2546383
Title :
Design tradeoffs in a hardware implementation of the k-means clustering algorithm
Author :
Leeser, Miriam ; Theiler, James ; Estlick, Michael ; Szymanski, John J.
Author_Institution :
Dept. of Electr. & Comput. Eng., Northeastern Univ., Boston, MA, USA
fYear :
2000
fDate :
2000
Firstpage :
520
Lastpage :
524
Abstract :
Hyperspectral imagery provides exquisitely detailed information but poses a serious challenge to the image analyst. Massive quantities of data must be reduced in a way that identifies useful features. Pixel clustering algorithms provide one approach: each pixel is assigned to a class based on the spectral similarity of that pixel to other members of the class. An image where each pixel is represented by its class provides a single picture (not hundreds of channels) for the analyst to interpret. Opportunities for fine-grain parallelism in this computationally intensive calculation motivate the consideration of specialized hardware. We are developing an FPGA-based implementation of the k-means clustering algorithm, because FPGAs can provide considerable speedup yet are sufficiently flexible to permit the testing of variants and tradeoffs. Assigning pixels to classes requires a definition of spectral similarity. The standard choice is the Euclidean distance. We consider two alternative distances: the Manhattan (sum of the absolute coordinate differences) and the Max (maximum of the absolute differences). The Euclidean distance is optimal in the sense of minimizing within-class variance, but requires computing many squares. The Manhattan and Max distances eliminate this squaring, and furthermore reduce the number of bits needed to store intermediate calculations. We conducted two sets of experiments to assess the tradeoff between quality of clustering and efficiency of hardware implementation. The first considered an idealized setup for assessing mis-classification rates and their effect on within-class variance. The second evaluated within-class variance for real 224-channel AVIRIS data sets. Both experiments confirmed that the highest quality clusters are achieved using the Euclidean metric, and quantified the difference
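The three distances compared in the abstract can be sketched in a few lines. This is an illustrative sketch only, not the authors' FPGA design: the function names, the toy two-channel pixel, and the 12-bit sample width used in the bit-count helper are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Pixel = Sequence[int]  # one spectral vector, one integer per channel


def euclidean_sq(p: Pixel, c: Pixel) -> int:
    """Squared Euclidean distance (the square root does not change the nearest center)."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def manhattan(p: Pixel, c: Pixel) -> int:
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, c))


def max_dist(p: Pixel, c: Pixel) -> int:
    """Maximum of the absolute coordinate differences."""
    return max(abs(a - b) for a, b in zip(p, c))


def assign(pixels: List[Pixel], centers: List[Pixel],
           dist: Callable[[Pixel, Pixel], int]) -> List[int]:
    """k-means assignment step: label each pixel with its nearest center under `dist`."""
    return [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            for p in pixels]


def within_class_variance(pixels: List[Pixel], centers: List[Pixel],
                          labels: List[int]) -> float:
    """Mean squared Euclidean distance from each pixel to its assigned center."""
    total = sum(euclidean_sq(p, centers[k]) for p, k in zip(pixels, labels))
    return total / len(pixels)


def worst_case_bits(channels: int, sample_bits: int, metric: str) -> int:
    """Bits needed to hold the largest possible accumulated distance.

    `sample_bits` is a hypothetical input width, not a figure from the paper.
    """
    max_diff = (1 << sample_bits) - 1
    if metric == "euclidean":
        return (channels * max_diff ** 2).bit_length()
    if metric == "manhattan":
        return (channels * max_diff).bit_length()
    if metric == "max":
        return max_diff.bit_length()
    raise ValueError(f"unknown metric: {metric}")


# Toy example: one pixel, two candidate centers, two channels.
pixels = [(0, 0)]
centers = [(5, 0), (3, 3)]
labels_euclidean = assign(pixels, centers, euclidean_sq)
labels_manhattan = assign(pixels, centers, manhattan)
labels_max = assign(pixels, centers, max_dist)
```

In this toy case the Manhattan distance picks a different center than the Euclidean distance, and the resulting within-class variance is higher, which is the kind of mis-classification cost the experiments measure. The helper `worst_case_bits` illustrates why dropping the squaring narrows the accumulator: the squared term roughly doubles the width of each per-channel contribution before summation.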
Keywords :
field programmable gate arrays; image classification; image representation; parallel algorithms; pattern clustering; spectral analysis; AVIRIS data sets; Euclidean distance; Euclidean metric; FPGA; Manhattan distance; absolute coordinate differences sum; clustering quality; design tradeoffs; fine-grain parallelism; hardware implementation; hyperspectral imagery; k-means clustering algorithm; maximum absolute differences; mis-classification rates; pixel clustering algorithms; spectral similarity; within-class variance minimization; Clustering algorithms; Concurrent computing; Hardware; Hyperspectral imaging; Hyperspectral sensors; Image analysis; Information analysis; Parallel processing; Pixel
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Proceedings of the 2000 IEEE Sensor Array and Multichannel Signal Processing Workshop
Conference_Location :
Cambridge, MA
Print_ISBN :
0-7803-6339-6
Type :
conf
DOI :
10.1109/SAM.2000.878063
Filename :
878063