Title :
Clustering algorithms for content-based publication-subscription systems
Author :
Riabov, Anton ; Liu, Zhen ; Wolf, Joel L. ; Yu, Philip S. ; Zhang, Li
Author_Institution :
IEOR Dept., Columbia Univ., New York, NY, USA
Abstract :
We consider efficient communication schemes based on both network-supported and application-level multicast techniques for content-based publication-subscription systems. We show that the communication costs depend heavily on the network configuration and on the distributions of publications and subscriptions. We devise new algorithms and adapt existing partitional data clustering algorithms. These algorithms can be used to determine multicast groups with as much commonality as possible, based on the totality of subscribers' interests. They perform well in the context of highly heterogeneous subscriptions, and they also scale well. An efficiency of 60% to 80% with respect to the ideal solution can be achieved with a small number of multicast groups (fewer than 100 in our experiments). Some of these same concepts can be applied to match publications to subscribers in real time, and also to determine dynamically whether to unicast, multicast or broadcast information about the events over the network to the matched subscribers. We demonstrate the quality of our algorithms via simulation experiments.
Keywords :
content-based retrieval; electronic publishing; information dissemination; multicast communication; pattern clustering; application-level multicast techniques; broadcast; communication costs; content-based publication-subscription systems; efficient communication schemes; multicast groups; network configurations; network-supported multicast techniques; partitional data clustering algorithms; publication distribution; simulation; subscription distribution; unicast; Broadcasting; Clustering algorithms; Costs; Discrete event simulation; Electronic mail; Multicast algorithms; Partitioning algorithms; Stock markets; Subscriptions; Unicast
Conference_Title :
Distributed Computing Systems, 2002. Proceedings. 22nd International Conference on
Print_ISBN :
0-7695-1585-1
DOI :
10.1109/ICDCS.2002.1022250