Title of article :
Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering
Author/Authors :
Sedighi, Z. - Shiraz University ; Boostani, R. - Shiraz University
Pages :
9
From page :
287
To page :
295
Abstract :
Although several works have sought to improve clustering efficiency, most state-of-the-art schemes still lack robustness and stability. This paper proposes an efficient approach that elicits prior knowledge, in the form of must-link and cannot-link constraints, from the estimated distribution of the raw data, thereby converting a blind clustering problem into a semi-supervised one. The density distribution of the data is estimated with a Weibull Mixture Model, chosen for its high flexibility. A further contribution of this work is a new hill-and-valley-seeking algorithm that finds the constraints for a semi-supervised algorithm. The proposed valley-seeking algorithm does not require any user-defined parameter. Each dominant density peak is assumed to lie on a cluster center; therefore, the samples neighboring each center are treated as must-link samples, while near-centroid samples belonging to different clusters are treated as cannot-link samples. The proposed approach is applied to a standard image dataset (designed for clustering evaluation) of Berkeley University along with some UCI datasets. The results achieved on both databases demonstrate the superiority of the proposed method over conventional clustering methods.
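The constraint rule stated in the abstract (samples near a density peak are must-link; near-peak samples of different peaks are cannot-link) can be illustrated with a minimal sketch. This is not the authors' implementation: the peak locations are assumed to be given already (the paper obtains them from a Weibull Mixture Model and a valley-seeking step, neither of which is reproduced here), and the function name constraints_from_peaks and the neighborhood size k are hypothetical choices made only for illustration.

```python
import numpy as np
from itertools import combinations

def constraints_from_peaks(X, peaks, k=2):
    """Build must-link / cannot-link index pairs from assumed density peaks.

    X     : (n, d) array of samples.
    peaks : (c, d) array of peak locations (ASSUMED given; not estimated here).
    k     : hypothetical number of near-peak samples kept per peak.
    """
    X = np.asarray(X, dtype=float)
    peaks = np.asarray(peaks, dtype=float)
    # Distance from every peak to every sample, shape (c, n).
    dists = np.linalg.norm(X[None, :, :] - peaks[:, None, :], axis=2)
    # Assign each sample to its closest peak so the groups are disjoint.
    nearest_peak = dists.argmin(axis=0)
    groups = []
    for c in range(len(peaks)):
        members = np.flatnonzero(nearest_peak == c)
        order = np.argsort(dists[c, members], kind="stable")
        groups.append([int(i) for i in members[order][:k]])
    # Samples close to the same peak: must-link pairs.
    must_link = {tuple(sorted(p)) for g in groups for p in combinations(g, 2)}
    # Near-peak samples of two different peaks: cannot-link pairs.
    cannot_link = {
        (min(i, j), max(i, j))
        for ga, gb in combinations(groups, 2)
        for i in ga
        for j in gb
    }
    return must_link, cannot_link
```

The resulting pairs could then be passed to any constrained clustering routine; which semi-supervised algorithm the paper uses is not stated in this record.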
Keywords :
Semi-supervised Clustering , Valley-seeking Scheme , Weibull Mixture Model
Journal title :
Journal of Artificial Intelligence & Data Mining
Serial Year :
2018
Record number :
2449351