Title :
SSDT: a scalable subspace-splitting classifier for biased data
Author :
Wang, Haixun ; Yu, Philip S.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
Decision trees are widely used data mining models. Recently, a number of efficient, scalable algorithms for constructing decision trees on large, disk-resident datasets have been introduced. In this paper, we study the problem of learning scalable decision trees from datasets with a biased class distribution. Our objective is to build decision trees that are more concise and more interpretable while maintaining the scalability of the model. To achieve this, our approach searches for subspace clusters of data cases of the biased class, enabling multivariate splits based on weighted distances to those clusters. Other approaches to building concise, interpretable models, including multivariate decision trees and association rules, often introduce scalability and performance problems. The SSDT algorithm we present achieves the objective without loss of efficiency, scalability, or accuracy.
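The multivariate split described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual SSDT algorithm; the cluster center, feature weights, and threshold below are all hypothetical, chosen only to show how a weighted distance to a subspace cluster of the biased class can drive a single multivariate test:

```python
import math

def weighted_distance(x, center, weights):
    """Weighted Euclidean distance to a cluster center. The weights
    emphasize the subspace dimensions in which the cluster is tight
    (the weighting scheme here is a hypothetical illustration)."""
    return math.sqrt(sum(w * (a - c) ** 2
                         for a, c, w in zip(x, center, weights)))

def subspace_split(cases, center, weights, threshold):
    """Multivariate split: cases whose weighted distance to the
    biased-class cluster center is below `threshold` go to one
    branch (True), the rest to the other (False)."""
    return [weighted_distance(x, center, weights) < threshold
            for x in cases]

# Toy data: a rare-class cluster concentrated in the first two
# dimensions; the third dimension is irrelevant noise.
cases = [(0.1, 0.2, 5.0), (0.0, 0.1, -3.0), (4.0, 4.0, 0.0)]
center = (0.0, 0.0, 0.0)
weights = (1.0, 1.0, 0.0)  # zero weight = dimension outside the subspace
mask = subspace_split(cases, center, weights, threshold=1.0)
# → [True, True, False]
```

A univariate tree would need several axis-parallel tests to carve out such a cluster; a single distance-based test captures it in one node, which is the source of the conciseness the abstract claims.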
Keywords :
data mining; decision trees; learning (artificial intelligence); pattern classification; SSDT; association rules; biased class distribution; biased data; data mining models; disk-resident dataset; efficient scalable algorithms; multivariate splittings; performance; scalability; scalable decision tree learning; scalable subspace-splitting classifier; subspace cluster search; weighted distances; Association rules; Clustering algorithms; Data mining; Decision trees; Intrusion detection; Partitioning algorithms; Predictive models; Scalability; Testing; Training data;
Conference_Title :
Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM 2001)
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1119-8
DOI :
10.1109/ICDM.2001.989563