• DocumentCode
    1883522
  • Title

    Distributed Data Mining on Virtual Clusters

  • Author

    Mateescu, Gabriel ; Valdés, Julio

  • Author_Institution
    Research Computing Support Group, Canada
  • fYear
    2006
  • fDate
    14-17 May 2006
  • Firstpage
    6
  • Lastpage
    6
  • Abstract
    For complex processes investigated in scientific fields such as medicine and earth sciences, knowledge discovery that exposes the underlying structure of the processes is crucial for detecting changes of state and constructing forecasting procedures. A model discovery approach has recently been developed that uses computational intelligence techniques to deal with the heterogeneity, incompleteness and imprecision of the data describing complex processes. While the approach offers a tractable and effective means for model discovery, it is still computationally expensive, routinely requiring tens of thousands of hours of CPU time. To satisfy the needs of such applications, it is more cost-effective to employ shared resources located in different departments of an organization than to purchase large and expensive compute clusters. We present a method for aggregating resources under multiple administrative domains into a virtual resource that can efficiently satisfy the needs of data-mining based model discovery. The proposed resource aggregation and job management approach provides an end-to-end solution to distributed data mining across organization-wide resources.
  • Keywords
    Computational intelligence; Computer architecture; Councils; Data mining; Drives; Economic forecasting; Geoscience; Identity management systems; Information technology; Resource management;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computing in an Advanced Collaborative Environment, 2006. HPCS 2006. 20th International Symposium on
  • ISSN
    1550-5243
  • Print_ISBN
    0-7695-2582-2
  • Type

    conf

  • DOI
    10.1109/HPCS.2006.20
  • Filename
    1628197