DocumentCode :
2797742
Title :
Solving Credit Scoring Problem with Ensemble Learning: A Case Study
Author :
Xie, Hongrui ; Han, Shuli ; Shu, Xinyi ; Yang, Xinzhu ; Qu, Xiuyun ; Zheng, Shiqiang
Author_Institution :
Dept. of Autom., Tsinghua Univ., Shenzhen, China
Volume :
1
fYear :
2009
fDate :
Nov. 30 2009-Dec. 1 2009
Firstpage :
51
Lastpage :
54
Abstract :
Managing customer credit is an important issue in the banking industry, and it should be automated using a trustworthy credit scoring model. This paper presents our solution to the PAKDD 2009 data mining competition as a case study of the credit scoring problem. Following a brief description of the data mining task, we discuss several challenges encountered in it, such as the imbalanced dataset, missing values, and data transformation. After a series of preliminary experiments, logistic regression and AdaBoost were selected as the classifiers for this particular problem. An ensemble of the two classifiers was then created to achieve even better performance. The final result shows that our solution is effective and efficient, with an AUC value of 0.6535, the fifth best result among more than 100 competing teams.
Keywords :
banking; data mining; regression analysis; AdaBoost; PAKDD 2009 data mining competition; banking industry; credit scoring problem; data mining task; data transformation; ensemble learning; imbalanced dataset; logistic regression; missing values; Automation; Banking; Data mining; Data preprocessing; Knowledge acquisition; Logistics; Pattern analysis; Performance analysis; Predictive models; Testing; credit scoring; data mining; ensemble method; imbalance processing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Knowledge Acquisition and Modeling, 2009. KAM '09. Second International Symposium on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3888-4
Type :
conf
DOI :
10.1109/KAM.2009.241
Filename :
5362208