• DocumentCode
    2603986
  • Title

    An Improved Fuzzy Clustering Method for Text Mining

  • Author

    Deng, Jiabin ; Hu, Juanli ; Chi, Hehua ; Wu, Juebo

  • Author_Institution
    Comput. Eng. Dept., Zhongshan Polytech., Zhongshan, China
  • Volume
    1
  • fYear
    2010
  • fDate
    24-25 April 2010
  • Firstpage
    65
  • Lastpage
    69
  • Abstract
    In recent years, text mining has gradually become a new research topic, and within it text clustering has attracted wide attention. This paper proposes an improved fuzzy text clustering method based on the fuzzy C-means (FCM) clustering algorithm and the edit distance algorithm. Feature evaluation is used to reduce the dimensionality of the high-dimensional text vectors. Because the clustering results of the traditional FCM algorithm lack stability, we introduce a high-power sample point set, a field radius, and weights. To address the boundary value attribution problem of the traditional FCM algorithm, we use the edit distance algorithm. The results show that, when applied to text clustering, the improved algorithm produces more stable and accurate clustering results than the traditional FCM algorithm.
  • Keywords
    data mining; feature extraction; pattern clustering; text analysis; text editing; edit distance algorithm; feature evaluation; fuzzy c-means clustering algorithm; text mining; Clustering algorithms; Clustering methods; Computer networks; Computer security; Data engineering; Information security; Partitioning algorithms; Stability; Wireless communication; Edit Distance; Fuzzy Clustering; Text Clustering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Networks Security Wireless Communications and Trusted Computing (NSWCTC), 2010 Second International Conference on
  • Conference_Location
    Wuhan, Hubei
  • Print_ISBN
    978-0-7695-4011-5
  • Electronic_ISBN
    978-1-4244-6598-9
  • Type

    conf

  • DOI
    10.1109/NSWCTC.2010.23
  • Filename
    5481117
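
The abstract names two building blocks, the edit distance algorithm and fuzzy C-means clustering. As a rough illustration only, the sketch below shows the textbook forms of both: the Levenshtein edit distance between two strings, and the standard FCM membership of one sample given its distances to the cluster centres. It is not the paper's improved method; the record does not specify the high-power sample point set, the field radius, or the weights, so none of them are modelled here. The function names `edit_distance` and `fcm_memberships` and the fuzzifier default `m=2.0` are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Textbook Levenshtein distance with unit-cost edits (illustrative only)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fcm_memberships(distances, m=2.0):
    """Standard FCM memberships of one sample from its distances to each centre.

    m is the fuzzifier (must be > 1); 2.0 is an assumed default, not a value
    taken from the paper.
    """
    # A sample lying exactly on one or more centres is shared among those only.
    on_centre = [d == 0 for d in distances]
    if any(on_centre):
        n = sum(on_centre)
        return [1.0 / n if hit else 0.0 for hit in on_centre]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in distances]
    total = sum(inverse)
    # Memberships are normalised so that they sum to 1 across clusters.
    return [v / total for v in inverse]
```

For instance, `edit_distance("kitten", "sitting")` evaluates to 3, and a sample equidistant from two centres receives equal membership in both. One plausible reading of the abstract is that such near-tied boundary samples are the ones the edit distance is meant to help assign, but the record does not confirm this.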