Title :
Clustering Heterogeneous Data Sets
Author :
Abdullin, Artur ; Nasraoui, Olfa
Author_Institution :
Dept. of Comput. Eng. & Comput. Sci., Univ. of Louisville, Louisville, KY, USA
Abstract :
Recent years have seen increasing interest in clustering data that span multiple domains or modalities, such as categorical, numerical, and transactional attributes. Such data arise in the context of multiview, heterogeneous, or multimodal clustering. Traditionally, different types of attributes or domains have been handled in one of two ways: by first combining them into one format (possibly using some type of conversion) and then applying a traditional clustering algorithm, or by computing a combined distance matrix that takes into account the distance values for each domain and then applying a relational or graph clustering approach. In other cases, where the data consist of multiple views, multiview clustering has been used. In this paper, we review existing approaches such as multiview clustering and discuss several additional approaches that can be harnessed for clustering heterogeneous data once they are adapted for this purpose. These additional approaches include ensemble clustering, collaborative clustering, and semi-supervised clustering.
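The combined-distance-matrix strategy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy records, the equal weights, the max-based rescaling, and the choice of average-linkage agglomerative clustering as the relational step are all assumptions made here for illustration, and every function name below is hypothetical.

```python
from itertools import combinations


def numeric_distance(a, b):
    """Euclidean distance between two numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def categorical_distance(a, b):
    """Fraction of categorical attributes on which two records disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(records, dist):
    """Pairwise distances for one domain, rescaled to [0, 1] (assumed scaling)."""
    n = len(records)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dist(records[i], records[j])
    largest = max(max(row) for row in d) or 1.0  # avoid dividing by zero
    return [[v / largest for v in row] for row in d]


def combine(matrices, weights):
    """Weighted sum of per-domain distance matrices (weights are an assumption)."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) for j in range(n)]
        for i in range(n)
    ]


def average_linkage(d, k):
    """Relational clustering: merge the closest pair of clusters until k remain."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > k:
        def linkage(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

        # combinations yields p < q, so removing q leaves index p valid
        p, q = min(combinations(range(len(clusters)), 2), key=linkage)
        merged = clusters.pop(q)
        clusters[p].extend(merged)
    return clusters


# Toy data: the same four objects described in two domains (made up for this sketch).
numeric = [(1.0, 2.0), (1.2, 1.9), (8.0, 9.0), (8.3, 9.1)]
categorical = [("red", "small"), ("red", "small"), ("blue", "large"), ("blue", "large")]

combined = combine(
    [distance_matrix(numeric, numeric_distance),
     distance_matrix(categorical, categorical_distance)],
    weights=[0.5, 0.5],
)
result = average_linkage(combined, k=2)
print(result)
```

Because the clustering step only sees the combined matrix, any relational or graph clustering method could be substituted for the average-linkage routine; the domain weights control how much each attribute type influences the final partition.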
Keywords :
learning (artificial intelligence); pattern clustering; SSL; collaborative clustering; combined distance matrix; data clustering; ensemble clustering; graph clustering approach; heterogeneous data sets; multiview clustering; semisupervised clustering; Clustering algorithms; Collaboration; Distortion measurement; Hidden Markov models; Linear programming; Partitioning algorithms; Peer to peer computing; clustering; heterogeneous data set
Conference_Title :
Web Congress (LA-WEB), 2012 Eighth Latin American
Conference_Location :
Cartagena de Indias
Print_ISBN :
978-1-4673-4473-9
DOI :
10.1109/LA-WEB.2012.27