DocumentCode :
1819958
Title :
A clustering method using an irregular size cell graph
Author :
Nakamura, Tomotake ; Kamidoi, Yoko ; Wakabayashi, Shin'ichi ; Yoshida, Noriyoshi
Author_Institution :
Graduate Sch. of Inf. Sci., Hiroshima City Univ., Japan
fYear :
2005
fDate :
3-4 April 2005
Firstpage :
19
Lastpage :
26
Abstract :
In this paper we propose a clustering method (a data mining technique) called "FlexDice" for large high-dimensional datasets. FlexDice uses a graph data structure. This structure shares a few features with Quadtree, but the two differ in crucial ways, the most important being that Quadtree is a tree structure while the structure used in FlexDice is a graph. In this paper we show the differences between the two structures. The algorithm that constructs a Quadtree forms cells hierarchically by dividing the data object space in a top-down manner. Searching for or indexing data objects therefore requires traversing from the root of the tree to each of its leaves. FlexDice, in contrast, requires no tree structure, because such traversal operations are unnecessary. Instead of choosing a dividing hyperplane, however, a clustering method must merge the relevant cells that contain similar data objects. Hence, after dividing cells, FlexDice creates neighboring links among relevant cells in every layer and merges the cells that contain similar data objects. To reduce memory usage, FlexDice dynamically removes worthless cells and keeps only worthwhile cells, that is, cells that contain data objects, together with the parent cells needed to create neighboring links among them. Once the neighboring links among worthwhile cells have been created, these parent cells are removed from memory. We present the differences between the data structure used in FlexDice and that of Quadtree, and show that the data structure used in FlexDice is suitable for clustering.
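The idea of linking neighboring cells and merging them, as described in the abstract, can be illustrated with a minimal sketch. This is not the FlexDice algorithm: it assumes a single layer of uniform-size cells, links only face-adjacent cells, and treats every non-empty cell as worthwhile. The function name `cell_graph_clusters` and the parameter `cell_size` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def cell_graph_clusters(points, cell_size):
    """Cluster points by merging linked, non-empty grid cells.

    Illustrative only: uniform cells, one layer, face-adjacent links.
    """
    # Store only cells that contain data objects; empty cells never exist in memory.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(x // cell_size) for x in p)
        cells[key].append(p)
    if not cells:
        return []

    def neighbors(key):
        # Yield stored cells that differ from `key` by 1 in exactly one dimension.
        for axis in range(len(key)):
            for step in (-1, 1):
                n = key[:axis] + (key[axis] + step,) + key[axis + 1:]
                if n in cells:
                    yield n

    # Each connected component of the cell graph becomes one cluster.
    clusters, seen = [], set()
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            key = stack.pop()
            members.extend(cells[key])
            for n in neighbors(key):
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        clusters.append(members)
    return clusters


if __name__ == "__main__":
    pts = [(0.1, 0.2), (0.9, 0.8), (1.5, 0.5), (5.0, 5.0)]
    for cluster in cell_graph_clusters(pts, cell_size=1.0):
        print(cluster)
```

Because the cells live in a hash map and are reached through neighbor lookups, no root-to-leaf traversal is involved, which is the contrast with Quadtree that the abstract draws. The hierarchical division into irregular-size cells and the removal of parent cells are left out of this sketch.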
Keywords :
data mining; pattern clustering; spatial data structures; very large databases; FlexDice; clustering method; data mining; data structure; graph structure; irregular size cell graph; large high-dimensional datasets; quadtree; tree-structure; Clustering algorithms; Clustering methods; Data mining; Data structures; Indexing; Information technology; Large-scale systems; Spatial databases; Tree data structures; Tree graphs;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research Issues in Data Engineering: Stream Data Mining and Applications, 2005. RIDE-SDMA 2005. 15th International Workshop on
ISSN :
1097-8585
Print_ISBN :
0-7695-2390-0
Type :
conf
DOI :
10.1109/RIDE.2005.5
Filename :
1498227