DocumentCode :
1537349
Title :
Data mining: from serendipity to science
Author :
Ramakrishnan, Naren ; Grama, Ananth Y.
Author_Institution :
Dept. of Comput. Sci., Virginia Polytech. Inst. & State Univ., Blacksburg, VA, USA
Volume :
32
Issue :
8
fYear :
1999
fDate :
8/1/1999 12:00:00 AM
Firstpage :
34
Lastpage :
37
Abstract :
The idea of unsupervised learning from basic facts (axioms) or from data has fascinated researchers for decades. Knowledge discovery engines try to extract general inferences from facts or training data. Statistical methods take a more structured approach, attempting to quantify data by known and intuitively understood models. The problem of gleaning knowledge from existing data sources poses a significant paradigm shift from these traditional approaches. The size, noise, diversity, dimensionality, and distributed nature of typical data sets make even formal problem specification difficult. Moreover, you typically do not have control over data generation. This lack of control opens up a Pandora's box filled with issues such as overfitting, limited coverage, and missing/incorrect data with high dimensionality. Once specified, solution techniques must deal with complexity, scalability (to meaningful data sizes), and presentation. This entire process is where data mining makes its transition from serendipity to science.
Keywords :
data mining; complexity; data generation; data sources; facts; inference; knowledge discovery engines; presentation; scalability; statistical methods; training data; unsupervised learning; Artificial intelligence; Data mining; Engines; Large-scale systems; Machine learning; Machine learning algorithms; Scalability; Statistical analysis; Training data; Unsupervised learning
fLanguage :
English
Journal_Title :
Computer
Publisher :
IEEE
ISSN :
0018-9162
Type :
jour
DOI :
10.1109/2.781632
Filename :
781632