• DocumentCode
    2030071
  • Title
    Improve K-means clustering for audio data by exploring a reasonable sampling rate
  • Author
    Chen, Gang ; Han, Bo
  • Author_Institution
    Int. Sch. of Software, Wuhan Univ., Wuhan, China
  • Volume
    4
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    1639
  • Lastpage
    1642
  • Abstract
    K-means clustering is sensitive to its starting points, and its running time is high for large datasets such as audio. Sampling is widely used to find “better” starting points and thereby speed up convergence. However, how to choose a reasonable sampling rate remains an open problem. In this paper, we report our initial exploration of locating reasonable sampling rates for different datasets. The procedure progressively increases the sampling rate and uses the cluster centers from the previous stage as the starting points for the next clustering. The resulting curve relating sampling rate to iteration number shows a turning point, which is taken as the reasonable sampling rate. On two experimental audio datasets, the procedure clusters the data more efficiently while maintaining similar clustering quality.
  • Keywords
    data mining; pattern clustering; K-means clustering; audio data; data clustering; reasonable sampling-rate; Algorithm design and analysis; Clustering algorithms; Data mining; Presses; Shape; Software; Software algorithms; K-means; audio; clustering; sampling-rate; starting points;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type
    conf
  • DOI
    10.1109/FSKD.2010.5569371
  • Filename
    5569371
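
The procedure summarized in the abstract (cluster a small sample, then reuse the resulting centers as starting points at the next, larger sampling rate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kmeans` and `progressive_kmeans`, the schedule of rates, the convergence tolerance, and the use of plain Lloyd's iterations with squared Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def kmeans(data, centers, max_iter=100, tol=1e-6):
    """Plain Lloyd's K-means from given starting centers.

    Returns (centers, iterations used). Hypothetical helper for illustration.
    """
    centers = centers.copy()
    for iteration in range(1, max_iter + 1):
        # Assign each point to its nearest center (squared Euclidean distance).
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift <= tol:
            break
    return centers, iteration


def progressive_kmeans(data, k, rates=(0.01, 0.05, 0.1, 0.5, 1.0), seed=0):
    """Cluster at increasing sampling rates, warm-starting each stage.

    The rate schedule is an assumed example, not taken from the paper.
    Returns the final centers and a list of (rate, iterations) pairs.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(data)
    # Initial starting points: k distinct rows drawn at random.
    centers = data[rng.choice(n, size=k, replace=False)]
    history = []
    for rate in rates:
        size = min(n, max(k, int(round(rate * n))))
        sample = data[rng.choice(n, size=size, replace=False)]
        # Centers from the previous stage seed the current stage.
        centers, iters = kmeans(sample, centers)
        history.append((rate, iters))
    return centers, history
```

Plotting the `(rate, iterations)` pairs in `history` gives a curve of the kind the abstract describes; how the turning point is identified on that curve is not specified in this record and is left out of the sketch.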