• DocumentCode
    774320
  • Title
    Sampling strategies for mining in data-scarce domains
  • Author
    Ramakrishnan, N.; Bailey-Kellogg, Chris
  • Author_Institution
    Virginia Tech, VA, USA
  • Volume
    4
  • Issue
    4
  • fYear
    2002
  • Firstpage
    31
  • Lastpage
    43
  • Abstract
    A novel framework leverages physical properties for mining in data-scarce domains. It interleaves bottom-up data mining with top-down data collection, leading to effective and explainable sampling strategies. This article describes focused sampling strategies for mining scientific data. Our approach is based on the Spatial Aggregation Language (SAL), which supports construction of data interpretation and control design applications for spatially distributed physical systems in a bottom-up manner. Used as a basis for describing data mining algorithms, SAL programs also help exploit knowledge of physical properties such as continuity and locality in data fields. We also introduce a top-down sampling strategy that focuses data collection on only those regions that are deemed most important to support a data mining objective.
  • Keywords
    data acquisition; data mining; eigenvalues and eigenfunctions; natural sciences computing; optimisation; sampling methods; data collection; data-scarce domains; eigenvalues; optimization; scientific data; spatial aggregation language; top-down sampling; Aerodynamics; Analytical models; Computational modeling; Data engineering; Design engineering; Distributed computing; Process design; Propulsion
  • fLanguage
    English
  • Journal_Title
    Computing in Science & Engineering
  • Publisher
    IEEE
  • ISSN
    1521-9615
  • Type
    jour
  • DOI
    10.1109/MCISE.2002.1014978
  • Filename
    1014978