Interval Data Clustering with Applications

Author

Peng, Wei; Li, Tao

Author_Institution

Sch. of Comput. Sci., Florida Int. Univ., Miami, FL

fYear

2006

fDate

Nov. 2006

Firstpage

355

Lastpage

362

Abstract

Interval data is described by a group of variables, each of which contains a range of continuous values instead of the traditional single continuous or discrete value. Traditional data analysis simply replaces each interval by its representative (e.g., center or mean) and ignores the structural information of intervals. In this paper, we study the problem of clustering interval data using modified or extended interval data dissimilarity measures. Our contributions are two-fold. First, we discuss various approaches for measuring the dissimilarities/distances between interval data, investigate the relations among them, and present a comprehensive experimental study on clustering interval data. We show that extended interval data clustering achieves better performance than traditional approaches and produces more meaningful and explanatory results. Second, we propose a two-stage approach for clustering interval data by exploiting the relations between the traditional distances and the modified distances. Experimental results show the effectiveness of our approach.

Keywords

data analysis; data structures; pattern clustering; interval data clustering; interval data dissimilarity measures; traditional data analysis; Application software; Artificial intelligence; Clustering algorithms; Computational efficiency; Computer science; Data analysis; Euclidean distance; Histograms

fLanguage

English

Publisher

IEEE

Conference_Titel

Tools with Artificial Intelligence, 2006. ICTAI '06. 18th IEEE International Conference on

Conference_Location

Arlington, VA

ISSN

1082-3409

Print_ISBN

0-7695-2728-0

Type

conf

DOI

10.1109/ICTAI.2006.71

Filename

4031919