Title of article :
An experimental investigation of the impact of aggregation on the performance of data mining with logistic regression
Author/Authors :
Adam Fadlalla (author)
Issue Information :
Journal, serial issue, year 2005
Pages :
13
From page :
695
To page :
707
Abstract :
We studied the impact of data aggregation on the performance of logistic regression in predicting the direction of the Dow Jones Industrial Average (DJIA) stock market index. Data aggregation is a common operation in business, science, engineering, medicine, and other fields; it is performed for purposes such as statistical, financial, and sales and marketing analysis, particularly within the context of a data warehouse. We showed experimentally that, for this example, as long as aggregation does not shrink the sample size unduly, it does not significantly impair the performance of the logistic regression model for predicting the direction of the DJIA. We also observed that aggregation-based models are simpler (less over-parameterized) than detail-based models. We used receiver operating characteristic (ROC) analysis to evaluate the robustness of such predictive models. Specifically, we used the area under the ROC curve as a summary measure of the overall performance of a given model.
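As a rough illustration of the summary measure named in the abstract, the area under the ROC curve can be computed as the fraction of positive/negative pairs that a model ranks correctly. The sketch below is not the author's code: the function name `roc_auc`, the 0/1 label encoding (1 = index went up), and the toy values are all assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: sequence of 0/1 outcomes (assumed encoding: 1 = index went up)
    scores: predicted probabilities from any classifier, e.g. logistic regression
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy, made-up values purely for illustration (not data from the article).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.6]
toy_auc = roc_auc(toy_labels, toy_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is what makes the measure convenient for comparing a detail-based model with an aggregation-based one on the same held-out cases.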
Keywords :
Aggregation, Prediction, Logistic regression, Predictive modeling, DJIA, ROC, Area under the ROC curve, Model performance, Data warehouse, Data mining, Model assessment
Journal title :
Information and Management
Serial Year :
2005
Record number :
1226646