DocumentCode
774320
Title
Sampling strategies for mining in data-scarce domains
Author
Ramakrishnan, N.; Bailey-Kellogg, Chris
Author_Institution
Virginia Tech, VA, USA
Volume
4
Issue
4
fYear
2002
Firstpage
31
Lastpage
43
Abstract
A novel framework leverages physical properties for mining in data-scarce domains. It interleaves bottom-up data mining with top-down data collection, leading to effective and explainable sampling strategies. This article describes focused sampling strategies for mining scientific data. Our approach is based on the Spatial Aggregation Language (SAL), which supports bottom-up construction of data interpretation and control design applications for spatially distributed physical systems. Used as a basis for describing data mining algorithms, SAL programs also help exploit knowledge of physical properties such as continuity and locality in data fields. We also introduce a top-down sampling strategy that focuses data collection only on those regions deemed most important to supporting a data mining objective.
Keywords
data acquisition; data mining; eigenvalues and eigenfunctions; natural sciences computing; optimisation; sampling methods; data collection; data-scarce domains; eigenvalues; optimization; scientific data; spatial aggregation language; top-down sampling; Aerodynamics; Analytical models; Computational modeling; Data engineering; Design engineering; Distributed computing; Process design; Propulsion
fLanguage
English
Journal_Title
Computing in Science & Engineering
Publisher
IEEE
ISSN
1521-9615
Type
jour
DOI
10.1109/MCISE.2002.1014978
Filename
1014978