• DocumentCode
    1687745
  • Title

    Data mining on the grid for the grid

  • Author

    Chawla, Nitesh V. ; Thain, Douglas ; Lichtenwalter, Ryan ; Cieslak, David A.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN
  • fYear
    2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Both users and administrators of computing grids are presented with enormous challenges in debugging and troubleshooting. Diagnosing a problem with one application on one machine is hard enough, but diagnosing problems in workloads of millions of jobs running on thousands of machines is a problem of a new order of magnitude. Suppose that a user submits one million jobs to a grid, only to discover some time later that half of them have failed. Users of large-scale systems need tools that describe the overall situation, indicating which problems are commonplace versus occasional, and which are deterministic versus random. Machine learning techniques can be used to debug these kinds of problems in large-scale systems. We present a comprehensive framework from data to knowledge discovery as an important step towards achieving this vision.
  • Keywords
    data mining; grid computing; learning (artificial intelligence); program debugging; debugging; machine learning techniques; troubleshooting; Application software; Computer science; Data engineering; Large-scale systems; Optical packet switching; Switching circuits; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
  • Conference_Location
    Miami, FL
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-1693-6
  • Electronic_ISBN
    1530-2075
  • Type

    conf

  • DOI
    10.1109/IPDPS.2008.4536427
  • Filename
    4536427