DocumentCode
2030071
Title
Improve K-means clustering for audio data by exploring a reasonable sampling rate
Author
Chen, Gang ; Han, Bo
Author_Institution
Int. Sch. of Software, Wuhan Univ., Wuhan, China
Volume
4
fYear
2010
fDate
10-12 Aug. 2010
Firstpage
1639
Lastpage
1642
Abstract
K-means clustering is sensitive to its starting points, and its running time is high for large-scale data such as audio. Sampling is widely used to find “better” starting points and thereby speed up convergence. However, how to choose a reasonable sampling rate remains an open problem. In this paper, we report our initial exploration of locating reasonable sampling rates for different datasets. The procedure progressively increases the sampling rate and uses the cluster centers from the previous stage as the starting points for the next clustering. The resulting curve relating sampling rate to iteration number shows a turning point, which we take as the reasonable sampling rate. On two experimental audio datasets, the procedure clusters the data more efficiently while maintaining similar clustering quality.
Keywords
data mining; pattern clustering; K-means clustering; audio data; data clustering; reasonable sampling-rate; Algorithm design and analysis; Clustering algorithms; Data mining; Presses; Shape; Software; Software algorithms; K-means; audio; clustering; sampling-rate; starting points;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location
Yantai, Shandong
Print_ISBN
978-1-4244-5931-5
Type
conf
DOI
10.1109/FSKD.2010.5569371
Filename
5569371