How to Perform Incremental Clustering - A SOM Based View

Author

Chen Lei;Wu Chong

Author_Institution

Sch. of Manage., Harbin Inst. of Technol., Harbin, China

fYear

2015

Firstpage

450

Lastpage

455

Abstract

With the rapid development of network technology, internet users face massive amounts of textual data every day. Because clustering is unsupervised, it is a good way for users to analyze texts and organize them into categories. However, most recent clustering algorithms operate in a static setting, which means they cannot handle novel data efficiently: when new data appear, traditional clustering algorithms cannot easily change their structure. This restriction does not suit the internet, where novel data can appear at any time. For this reason, this paper proposes an incremental clustering algorithm for incrementally arriving data. The algorithm has two components. (a) It designs two measures of each feature's ability and integrates them into the similarity measurement, replacing co-occurrence-based similarity measurements. (b) Based on the proposed similarity measurement, the algorithm selects a few samples from the original texts to perform incremental clustering. Experimental results demonstrate that, after integrating feature ability, the algorithm clusters texts with high quality.
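
The two components named in the abstract can be illustrated with a minimal sketch. The abstract does not define the two feature measures, the sample-selection rule, or the map layout, so everything below is an assumption made for illustration: the per-feature weights `w` stand in for the paper's feature-ability measures, `weighted_cosine` stands in for its similarity measurement, `retained` stands in for the few samples kept from the original texts, and the map is a simple 1-D SOM-style chain of units. All function and parameter names are hypothetical and are not taken from the paper.

```python
import math


def weighted_cosine(x, y, w):
    """Cosine similarity with each feature i scaled by a non-negative weight w[i].

    The weights are an assumed stand-in for the paper's feature-ability measures.
    """
    dot = sum(wi * a * b for wi, a, b in zip(w, x, y))
    nx = math.sqrt(sum(wi * a * a for wi, a in zip(w, x)))
    ny = math.sqrt(sum(wi * b * b for wi, b in zip(w, y)))
    return dot / (nx * ny) if nx and ny else 0.0


def best_unit(units, x, w):
    """Index of the map unit most similar to document vector x."""
    return max(range(len(units)), key=lambda i: weighted_cosine(units[i], x, w))


def update(units, x, w, lr=0.5, radius=1):
    """Move the winning unit and its 1-D neighbours toward x; return the winner."""
    win = best_unit(units, x, w)
    lo, hi = max(0, win - radius), min(len(units), win + radius + 1)
    for i in range(lo, hi):
        h = lr / (1 + abs(i - win))  # neighbours move less than the winner
        units[i] = [u + h * (a - u) for u, a in zip(units[i], x)]
    return win


def incremental_fit(units, retained, new_docs, w, epochs=5, lr=0.5, radius=1):
    """Refresh an existing map using a few retained old samples plus the new ones.

    Returns the unit index assigned to each new document.
    """
    for _ in range(epochs):
        for x in retained + new_docs:
            update(units, x, w, lr=lr, radius=radius)
    return [best_unit(units, x, w) for x in new_docs]


# Toy usage: two existing units, two retained samples, two newly arrived documents.
units = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
w = [1.0, 1.0, 0.5]  # assumed feature weights
retained = [[1.0, 0.1, 0.0], [0.0, 0.9, 0.1]]
new_docs = [[0.9, 0.0, 0.2], [0.1, 1.0, 0.0]]
labels = incremental_fit(units, retained, new_docs, w, radius=0)
```

Replaying only a small retained subset alongside the new documents is what keeps the update cheap compared with retraining on the full original collection; how that subset is chosen is left open here because the abstract does not say.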

Keywords

"Transportation","Big data","Smart cities"

Publisher

IEEE

Conference_Titel

Intelligent Transportation, Big Data and Smart City (ICITBS), 2015 International Conference on

Type

conf

DOI

10.1109/ICITBS.2015.117

Filename

7384063