• DocumentCode
    1597364
  • Title

    A Service-Oriented Framework for Executing Data Mining Workflows on Grids

  • Author

    Lackovic, Marco ; Talia, Domenico ; Trunfio, Paolo

  • Author_Institution
    DEIS, Univ. of Calabria, Rende
  • fYear
    2009
  • Firstpage
    72
  • Lastpage
    79
  • Abstract
    Workflow environments are widely used in data mining systems to manage the data and execution flows associated with complex applications. Weka, one of the most widely used open-source data mining systems, includes the KnowledgeFlow environment, which provides a drag-and-drop interface to compose and execute data mining workflows. The Weka KnowledgeFlow allows users to execute a whole workflow only on a single computer. However, most data mining workflows include several independent branches that could be run in parallel on a set of distributed machines to reduce the overall execution time. We implemented distributed workflow execution in Weka4WS, a framework that extends Weka and its KnowledgeFlow environment to exploit distributed resources available in a Grid using Web Service technologies. In this paper we describe the Weka4WS architecture and the functionalities provided by its service-oriented KnowledgeFlow component, showing its use to compose and execute simple parallel data mining workflows. Furthermore, we present ongoing work aimed at also supporting data-parallel workflows on a Grid.
  • Keywords
    Web services; data mining; grid computing; user interfaces; workflow management software; Web service technology; Weka knowledgeflow; data management; data mining workflow execution; distributed machine; drag-and-drop interface; knowledgeflow environment; open-source data mining system; service-oriented framework; workflow management system; Algorithm design and analysis; Clustering algorithms; Conference management; Data mining; Environmental management; Graphical user interfaces; Libraries; Open source software; Pervasive computing; Data Mining; Grid; Web Services; Weka4WS; Workflows
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Workshops at the Grid and Pervasive Computing Conference, 2009 (GPC '09)
  • Conference_Location
    Geneva
  • Print_ISBN
    978-1-4244-4372-7
  • Type

    conf

  • DOI
    10.1109/GPC.2009.9
  • Filename
    4976547