DocumentCode :
2849799
Title :
Efficient density-based clustering of complex objects
Author :
Brecheisen, Stefan ; Kriegel, Hans-Peter ; Pfeifle, Martin
Author_Institution :
Inst. for Comput. Sci., Munich Univ., Germany
fYear :
2004
fDate :
1-4 Nov. 2004
Firstpage :
43
Lastpage :
50
Abstract :
Data mining in large databases of complex objects from scientific, engineering, or multimedia applications is becoming increasingly important. In many application domains, complex object representations together with complex distance functions are used to measure the similarity between objects. Often, simpler distance functions that can be computed much more efficiently are available alongside these complex distance measures. Traditionally, the well-known concept of multi-step query processing, which is based on exact and lower-bounding approximate distance functions, is used independently of data mining algorithms. In this paper, we demonstrate how the paradigm of multi-step query processing can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach tries to confine itself to ε-range queries on the simple distance functions and carries out complex distance computations only at those stages of the clustering algorithm where they are necessary to compute the correct clustering result. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of flat and hierarchical density-based clusterings by more than one order of magnitude.
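The filter-and-refine idea behind multi-step query processing can be illustrated with a minimal sketch of a conventional ε-range query. This is not the paper's integrated DBSCAN/OPTICS method, only the traditional building block the abstract refers to. All names here (`exact_distance`, `lower_bound_distance`, `multi_step_range_query`) are hypothetical, and plain Euclidean distance stands in for an expensive complex distance function; the only property assumed is that the cheap distance never exceeds the exact one.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def exact_distance(a: Vector, b: Vector) -> float:
    """Full Euclidean distance; a stand-in for an expensive complex distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lower_bound_distance(a: Vector, b: Vector) -> float:
    """Distance on the first coordinate only.

    It never exceeds the full Euclidean distance, so it is a valid
    lower bound (an assumption chosen purely for illustration).
    """
    return abs(a[0] - b[0])


def multi_step_range_query(
    query: Vector,
    objects: Sequence[Vector],
    eps: float,
    lower_bound: Distance,
    exact: Distance,
) -> Tuple[List[int], int]:
    """Return indices of objects within eps of query, plus the number of
    exact distance computations that were actually performed."""
    hits: List[int] = []
    exact_calls = 0
    for idx, obj in enumerate(objects):
        # Filter step: if even the lower bound exceeds eps, the exact
        # distance must exceed eps too, so the object is safely pruned.
        if lower_bound(query, obj) > eps:
            continue
        # Refinement step: only surviving candidates pay for the exact distance.
        exact_calls += 1
        if exact(query, obj) <= eps:
            hits.append(idx)
    return hits, exact_calls
```

Because the filter only discards objects whose lower bound already exceeds ε, the result is identical to a scan that uses the exact distance everywhere; the saving comes entirely from the exact computations that are skipped.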
Keywords :
data mining; pattern clustering; query processing; very large databases; DBSCAN; OPTICS; approximative distance function; clustering algorithm; complex distance computations; complex distance function; complex distance measures; complex object representation; density-based clustering; large databases; multistep query processing; test data sets; Acceleration; Application software; Clustering algorithms; Computer science; Data engineering; Image databases; Integrated optics; Multimedia databases
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10082
Filename :
1410265