Title :
A Text Clustering Algorithm Based on Find of Density Peaks
Author :
Peiyu Liu;Yingying Liu;Xiuyan Hou;Qingqing Li;Zhenfang Zhu
Author_Institution :
Shandong Yingcai Univ., Jinan, China
Abstract :
Text clustering is one of the core problems in text mining and information retrieval. Clustering algorithms fall into four categories: partitional, hierarchical, density-based, and intelligent clustering algorithms. However, most clustering algorithms cannot meet the speed and self-adaptation demands of text clustering. This paper proposes a text clustering algorithm based on finding density peaks. The algorithm computes the distance and density of texts from the similarity of their text vectors. SVM is used to represent the text and obtain the vector mapping for the similarity calculation. Next, for each text, the local density and the distance to points of higher density are computed, noise points are removed, and the cluster centers are selected. Each remaining point is assigned to the cluster represented by its nearest cluster center. Several sets of comparative experiments show that the density-based text clustering has advantages in reliability and robustness.
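The steps named in the abstract (text vectors, pairwise distance, local density, distance to higher-density points, noise removal, center selection, nearest-center assignment) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: plain bag-of-words counts stand in for the paper's text representation, distance is taken as one minus cosine similarity, and the parameters `d_c` (density cutoff), `n_clusters`, and `min_rho` (noise threshold) are hypothetical. Picking centers by the largest product of density and distance is a common heuristic for density-peaks clustering; the abstract does not state the selection rule.

```python
import numpy as np

def bag_of_words(texts):
    """Map each text to a term-count vector (assumed representation)."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: k for k, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for r, t in enumerate(texts):
        for w in t.lower().split():
            X[r, index[w]] += 1
    return X

def cosine_distance_matrix(X):
    """Pairwise distance defined as 1 - cosine similarity of the rows."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against empty texts
    U = X / norms
    D = np.clip(1.0 - U @ U.T, 0.0, None)
    np.fill_diagonal(D, 0.0)
    return D

def density_peaks(D, d_c, n_clusters, min_rho=1):
    """Cluster from a distance matrix; d_c, n_clusters, min_rho are assumed."""
    n = D.shape[0]
    # Local density: number of other points closer than the cutoff d_c.
    rho = (D < d_c).sum(axis=1) - 1
    # Visit points from highest to lowest density (ties broken by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = D[i].max()  # densest point: use its largest distance
        else:
            delta[i] = D[i, order[:rank]].min()  # nearest denser point
    # Noise removal (assumed rule): points with density below min_rho.
    noise = rho < min_rho
    # Centers: largest rho * delta among non-noise points.
    gamma = np.where(noise, -np.inf, rho * delta)
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    # Remaining points join the cluster of their nearest center.
    labels = np.argmin(D[:, centers], axis=1)
    labels[noise] = -1
    return labels, centers

texts = [
    "cat dog pet",
    "dog cat animal pet",
    "pet cat dog",
    "stock market price",
    "market price stock trade",
    "price stock market",
    "rain weather forecast",
]
D = cosine_distance_matrix(bag_of_words(texts))
labels, centers = density_peaks(D, d_c=0.5, n_clusters=2)
print(labels)
```

On this toy input the three pet-related texts share one label, the three market-related texts share the other, and the isolated weather text has no neighbor within `d_c`, so it is marked as noise with label -1.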
Keywords :
"Clustering algorithms","Partitioning algorithms","Clustering methods","Robustness","Text mining","Information retrieval","Algorithm design and analysis"
Conference_Title :
Information Technology in Medicine and Education (ITME), 2015 7th International Conference on
DOI :
10.1109/ITME.2015.103