DocumentCode :
1911355
Title :
An Improved Method for Combination Feature Selection in Web Click-Through Data Mining
Author :
Hongwei Zhao ; Yongfeng Huang
Author_Institution :
Dept. of Electron. & Eng., Tsinghua Univ., Beijing, China
fYear :
2012
fDate :
14-16 Dec. 2012
Firstpage :
381
Lastpage :
385
Abstract :
An important way to analyze web click-through data is to build a 2-class linear classifier and select the key subset of user features that mainly determines the hit result. In many circumstances, however, the fitting accuracy is poor because the model considers only the original features. Combination features, which are products of the original features, are often added to the classifier model to improve accuracy. These combination features cause a serious problem: they dramatically increase the number of features, which is called "feature dimension explosion". Traditional algorithms can hardly handle this because they need all the features as input at the beginning of processing. The Grafting method offers an incremental solution that adds only one feature at a time. However, the Grafting method is very inefficient when the feature space is huge and sparse. In this paper, we propose an adaptive Grafting algorithm and a PV filter method to solve the feature dimension explosion problem. Our algorithm significantly improves computational efficiency by reducing the number of model optimization steps, and it reduces the scale of the feature space by applying a very simple filter strategy so that the algorithm works effectively. Our experiments on real data show that combination features can be easily generated and selected with the adaptive Grafting algorithm and the PV filter method, which significantly raises the fitting accuracy of the model.
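The two ideas the abstract relies on, product ("combination") features and one-at-a-time feature addition, can be illustrated with a minimal Python sketch. It is not the paper's adaptive Grafting algorithm or its PV filter, whose details the abstract does not give. The function names are hypothetical, and the selection rule assumed here is the standard gradient test of plain Grafting: add the candidate with the largest absolute loss gradient, provided it exceeds a regularization threshold `lam`.

```python
from itertools import combinations
import math


def pairwise_products(row):
    """Return the original features followed by all pairwise products."""
    return list(row) + [a * b for a, b in combinations(row, 2)]


def combined_dimension(d):
    """Feature count after adding every pairwise product of d originals."""
    return d + d * (d - 1) // 2


def grafting_pick(candidates, labels, scores, lam):
    """One plain Grafting-style step (assumed rule, not the paper's variant).

    candidates: list of feature columns, each a list of per-sample values,
                all currently excluded from the model (weight zero).
    labels:     per-sample class labels in {0, 1}.
    scores:     per-sample linear scores of the current model.
    lam:        threshold the gradient magnitude must exceed.
    Returns the index of the candidate to add, or None if none qualifies.
    """
    n = len(labels)
    # Residual of the logistic model: sigmoid(score) - label.
    residuals = [1.0 / (1.0 + math.exp(-s)) - y for s, y in zip(scores, labels)]
    best_index, best_magnitude = None, lam
    for j, column in enumerate(candidates):
        # Gradient of the mean logistic loss w.r.t. this candidate's weight.
        gradient = sum(r * x for r, x in zip(residuals, column)) / n
        if abs(gradient) > best_magnitude:
            best_index, best_magnitude = j, abs(gradient)
    return best_index
```

With 1,000 original features, `combined_dimension(1000)` gives 500,500 features, which shows why feeding every combination feature to the learner up front becomes impractical and why an incremental test such as `grafting_pick` is attractive.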
Keywords :
Internet; data mining; pattern classification; 2-class linear classifier; PV filter method; Web click-through data mining; adaptive grafting algorithm; combination feature selection; feature dimension explosion; Grafting; click-through data; feature combination; feature selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Engineering (ISISE), 2012 International Symposium on
Conference_Location :
Shanghai
ISSN :
2160-1283
Print_ISBN :
978-1-4673-5680-0
Type :
conf
DOI :
10.1109/ISISE.2012.92
Filename :
6495369