Regional Information Center for Science and Technology (RICeST) - A Text Clustering Algorithm Based on Find of Density Peaks

DocumentCode :

3758986

Title :

A Text Clustering Algorithm Based on Find of Density Peaks

Author :

Peiyu Liu;Yingying Liu;Xiuyan Hou;Qingqing Li;Zhenfang Zhu

Author_Institution :

Shandong Yingcai Univ., Jinan, China

Year :

2015

Firstpage :

348

Lastpage :

352

Abstract :

Text clustering is one of the core problems in text mining and information retrieval. Clustering algorithms fall into four categories: partitioning, hierarchical, density-based, and intelligent clustering algorithms. However, most clustering algorithms cannot meet the speed and self-adaptation demands of text clustering. This paper proposes a text clustering algorithm based on finding density peaks. The algorithm computes text distance and density from the similarity of the text vectors. SVM is used to represent each text and obtain the vector mapping for the similarity calculation. Next, for each text, the local density and the distance to points of higher density are computed, noise points are removed, and cluster centers are selected. Each remaining point is assigned to the cluster represented by its nearest cluster center. Several sets of comparative experiments indicate that the density-based text clustering has advantages in reliability and robustness.
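The steps named in the abstract (similarity-based distance, local density, distance to points of higher density, center selection, nearest-center assignment) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything concrete here is an assumption rather than the authors' method: whitespace-tokenized term counts as the text vectors, cosine distance, a cutoff `d_c` for counting neighbors, ranking candidates by the product of density and distance, and a caller-supplied `n_clusters`. Density ties are broken by index order, and the noise-removal step is omitted.

```python
import math
from collections import Counter


def tf_vector(text):
    # Assumed representation: raw term counts over whitespace tokens.
    return Counter(text.lower().split())


def cosine_distance(a, b):
    # 1 - cosine similarity of two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - (dot / norm if norm else 0.0)


def density_peaks(texts, d_c=0.5, n_clusters=2):
    """Return (labels, centers) for a list of texts.

    d_c and n_clusters are hypothetical parameters of this sketch.
    """
    vecs = [tf_vector(t) for t in texts]
    n = len(vecs)
    dist = [[cosine_distance(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]

    # Local density: number of other texts closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    # Visit texts from densest to sparsest; the sort is stable, so ties
    # keep index order and the earlier text counts as "denser".
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta: distance to the nearest text that precedes this one in the
    # density order. The densest text gets its largest distance instead.
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
        else:
            delta[i] = min(dist[i][j] for j in order[:rank])

    # Candidate centers: texts with both high density and high delta.
    centers = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)[
        :n_clusters
    ]

    # Each text joins the cluster of its nearest center.
    labels = [
        min(range(len(centers)), key=lambda c: dist[i][centers[c]]) for i in range(n)
    ]
    return labels, centers
```

Calling `density_peaks(list_of_strings)` returns one cluster label per text plus the indices of the texts chosen as centers. The all-pairs distance matrix makes this quadratic in the number of texts, which is acceptable for an illustration but not for large collections.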

Keywords :

"Clustering algorithms","Partitioning algorithms","Clustering methods","Robustness","Text mining","Information retrieval","Algorithm design and analysis"

Publisher :

IEEE

Conference_Title :

Information Technology in Medicine and Education (ITME), 2015 7th International Conference on

Type :

conf

DOI :

10.1109/ITME.2015.103

Filename :

7429163

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3758986