• DocumentCode
    1638704
  • Title

    A Middleware for Developing and Deploying Scalable Remote Mining Services

  • Author

    Glimcher, Leonid ; Agrawal, Gagan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
  • fYear
    2008
  • Firstpage
    242
  • Lastpage
    249
  • Abstract
    In this paper, we consider the problem of developing service-oriented implementations of data-intensive applications that process data on remote servers. While the existing grid and web-service frameworks allow interoperability and flexible resource utilization, achieving efficiency and scalability remains a critical challenge. Similarly, the existing grid and web-service frameworks do not provide transparency in accessing and processing data from grid-based data servers. We present the design and evaluation of a system that supports a high-level interface for developing data mining and scientific data processing grid-services and targets data residing on SRB servers. Our evaluation, using two data mining applications and one scientific data processing application, yields two important observations. First, each of the applications we evaluated demonstrated good scalability with respect to dataset size, as well as to changing numbers of both data host and compute nodes. Second, there is only a small overhead associated with deploying our middleware-based applications using MPICH-G2 and Globus. This overhead varied between 14% and 22% and is primarily due to a larger memory footprint. Thus, overall, our work shows that it is feasible to develop and deploy scalable and efficient grid-services that process data from remote servers.
  • Keywords
    Web services; data mining; grid computing; middleware; Web-service; data processing grid-services; data-intensive applications; grid-based data servers; remote servers; scalable remote mining services; service-oriented implementations; Application software; Computer architecture; Computer science; Data engineering; Data processing; Resource management; Scalability; Globus Toolkit; MPICH-G2; grid middleware; scalable remote datamining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing and the Grid, 2008. CCGRID '08. 8th IEEE International Symposium on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-0-7695-3156-4
  • Electronic_ISBN
    978-0-7695-3156-4
  • Type

    conf

  • DOI
    10.1109/CCGRID.2008.36
  • Filename
    4534225