DocumentCode
1883522
Title
Distributed Data Mining on Virtual Clusters
Author
Mateescu, Gabriel ; Valdés, Julio
Author_Institution
Research Computing Support Group, Canada
fYear
2006
fDate
14-17 May 2006
Firstpage
6
Lastpage
6
Abstract
For complex processes investigated in scientific fields such as medicine and earth sciences, knowledge discovery that exposes the underlying structure of the processes is crucial for detecting changes of state and constructing forecasting procedures. A model discovery approach has recently been developed that uses computational intelligence techniques to deal with the heterogeneity, incompleteness, and imprecision of the data describing complex processes. While the approach offers a tractable and effective means for model discovery, it is still computationally expensive, routinely requiring tens of thousands of hours of CPU time. To satisfy the needs of such applications, it is more cost-effective to employ shared resources located in different departments of an organization than to purchase large and expensive compute clusters. We present a method for aggregating resources under multiple administrative domains into a virtual resource that can efficiently satisfy the needs of data-mining-based model discovery. The proposed resource aggregation and job management approach provides an end-to-end solution to distributed data mining across organization-wide resources.
Keywords
Computational intelligence; Computer architecture; Councils; Data mining; Drives; Economic forecasting; Geoscience; Identity management systems; Information technology; Resource management;
fLanguage
English
Publisher
IEEE
Conference_Titel
High-Performance Computing in an Advanced Collaborative Environment, 2006. HPCS 2006. 20th International Symposium on
ISSN
1550-5243
Print_ISBN
0-7695-2582-2
Type
conf
DOI
10.1109/HPCS.2006.20
Filename
1628197