DocumentCode :
1597364
Title :
A Service-Oriented Framework for Executing Data Mining Workflows on Grids
Author :
Lackovic, Marco ; Talia, Domenico ; Trunfio, Paolo
Author_Institution :
DEIS, Univ. of Calabria, Rende
fYear :
2009
Firstpage :
72
Lastpage :
79
Abstract :
Workflow environments are widely used in data mining systems to manage the data and execution flows associated with complex applications. Weka, one of the most widely used open-source data mining systems, includes the KnowledgeFlow environment, which provides a drag-and-drop interface to compose and execute data mining workflows. The Weka KnowledgeFlow allows users to execute a whole workflow only on a single computer. However, most data mining workflows include several independent branches that could be run in parallel on a set of distributed machines to reduce the overall execution time. We implemented distributed workflow execution in Weka4WS, a framework that extends Weka and its KnowledgeFlow environment to exploit distributed resources available in a Grid using Web Service technologies. In this paper we describe the Weka4WS architecture and the functionalities provided by its service-oriented KnowledgeFlow component, showing its use to compose and execute simple parallel data mining workflows. Furthermore, we present ongoing work aimed at also supporting data-parallel workflows on a Grid.
Keywords :
Web services; data mining; grid computing; user interfaces; workflow management software; Web service technology; Weka knowledgeflow; data management; data mining workflow execution; distributed machine; drag-and-drop interface; grid computing; knowledgeflow environment; open-source data mining system; service-oriented framework; workflow management system; Algorithm design and analysis; Clustering algorithms; Conference management; Data mining; Environmental management; Graphical user interfaces; Libraries; Open source software; Pervasive computing; Web services; Data Mining; Grid; Web Services; Weka4WS; Workflows;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Workshops at the Grid and Pervasive Computing Conference, 2009. GPC '09.
Conference_Location :
Geneva
Print_ISBN :
978-1-4244-4372-7
Type :
conf
DOI :
10.1109/GPC.2009.9
Filename :
4976547