• DocumentCode
    2586579
  • Title

    Optimizing Distributed Data Mining Applications Based on Object Clustering Methods

  • Author

    Fiolet, V. ; Laskowski, E. ; Olejnik, R. ; Masko, L. ; Toursel, B. ; Tudruj, M.

  • Author_Institution
    Lab. d'Informatique Fondamentale de Lille, Univ. des Sci. et Technol. de Lille
  • fYear
    2006
  • fDate
    13-17 Sept. 2006
  • Firstpage
    257
  • Lastpage
    262
  • Abstract
    The exponential computational cost involved in traditional data mining methods enforces search for less complex new algorithms. Especially, data mining on grid is a challenge due to the lack of shared memory in grid computing, which puts special attention to communication optimization. The aim of the DisDaMin project (distributed data mining), described in the paper, is solving data mining problems by using new distributed algorithms intended for execution in grid environments. The DisDaMin implements intelligent fragmentation of data by clustering methods and asynchronous collaborative processing adjusted to grid environments. The DG-ADAJ environment provides adaptive control of distributed applications written in Java for desktop grid. It constitutes a component-based middleware, which allows for optimized distribution of applications on clusters of Java virtual machines, monitoring of application execution and dynamic online balancing of processing and communication. The DG-ADAJ system provides a middleware platform for desktop grid that could be used as a deployment base for DisDaMin algorithms. In this paper, we propose static object placement optimization algorithms for fragmentation of data in the DisDaMin project. The algorithms use DG-ADAJ's object clustering methods to provide optimized local processing on each node with minimized inter-node communication.
  • Keywords
    Java; data mining; distributed algorithms; grid computing; optimisation; pattern clustering; virtual machines; DG-ADAJ environment; DisDaMin project; Java virtual machines; adaptive control; desktop grid; distributed algorithms; distributed data mining applications; grid computing; minimized inter-node communication; object clustering methods; static object placement optimization algorithms; Clustering algorithms; Clustering methods; Computational efficiency; Data mining; Distributed algorithms; Grid computing; Java; Machine intelligence; Middleware; Optimization methods;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Computing in Electrical Engineering, 2006. PAR ELEC 2006. International Symposium on
  • Conference_Location
    Bialystok
  • Print_ISBN
    0-7695-2554-7
  • Type

    conf

  • DOI
    10.1109/PARELEC.2006.57
  • Filename
    1698670