DocumentCode
1597364
Title
A Service-Oriented Framework for Executing Data Mining Workflows on Grids
Author
Lackovic, Marco ; Talia, Domenico ; Trunfio, Paolo
Author_Institution
DEIS, Univ. of Calabria, Rende
fYear
2009
Firstpage
72
Lastpage
79
Abstract
Workflow environments are widely used in data mining systems to manage the data and execution flows associated with complex applications. Weka, one of the most widely used open-source data mining systems, includes the KnowledgeFlow environment, which provides a drag-and-drop interface to compose and execute data mining workflows. The Weka KnowledgeFlow can execute a whole workflow only on a single computer, whereas most data mining workflows include several independent branches that could be run in parallel on a set of distributed machines to reduce the overall execution time. We implemented distributed workflow execution in Weka4WS, a framework that extends Weka and its KnowledgeFlow environment to exploit distributed resources available in a Grid using Web Service technologies. In this paper we describe the Weka4WS architecture and the functionalities provided by its service-oriented KnowledgeFlow component, showing its use to compose and execute simple parallel data mining workflows. Furthermore, we present ongoing work aimed at also supporting data-parallel workflows on a Grid.
Keywords
Web services; data mining; grid computing; user interfaces; workflow management software; Web service technology; Weka knowledgeflow; data management; data mining workflow execution; distributed machine; drag-and-drop interface; grid computing; knowledgeflow environment; open-source data mining system; service-oriented framework; workflow management system; Algorithm design and analysis; Clustering algorithms; Conference management; Data mining; Environmental management; Graphical user interfaces; Libraries; Open source software; Pervasive computing; Web services; Data Mining; Grid; Web Services; Weka4WS; Workflows;
fLanguage
English
Publisher
IEEE
Conference_Titel
Grid and Pervasive Computing Conference, 2009. GPC '09. Workshops at the
Conference_Location
Geneva
Print_ISBN
978-1-4244-4372-7
Type
conf
DOI
10.1109/GPC.2009.9
Filename
4976547