Title of article :
Some statistical issues related to multiple linear regression modeling of beach bacteria concentrations
Author/Authors :
Zhongfu Ge (author), Walter E. Frick (author)
Issue Information :
Journal with serial issue number, year 2007
Pages :
7
From page :
358
To page :
364
Abstract :
As a fast and effective technique, multiple linear regression (MLR) has been widely used to model and predict beach bacteria concentrations. Several issues, however, were addressed insufficiently or inconsistently in previous work on this subject: the value and use of interaction terms, serial correlation, the criteria for model selection, and model assessment. The present work shows that serial correlation, which is often present in sequentially observed data records, deserves full attention from the modeler, and that testing and adjustment for the time-series effect should be implemented in a statistically rigorous framework. R² and the Cp statistic are recommended as joint criteria for model selection, whereas using the t-statistics associated with the full model is erroneous. During model selection, interaction terms can often help to decrease the bias in reduced models, although the resulting improvement in numerical performance may be limited. For assessing a model's predictive capacity, which is different from testing goodness of fit, a comprehensive set of statistics is advocated to allow an objective evaluation of different models. Results obtained from data at Huntington Beach, OH, show that erroneous conclusions could be drawn if only the model R² and the counts of type I and type II errors are considered. In this sense, several previous works deserve further investigation.
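The selection and serial-correlation criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, and it uses synthetic data rather than the Huntington Beach records. The helper names (`ols_sse`, `mallows_cp`, `durbin_watson`) are hypothetical, and the Durbin-Watson statistic is assumed here as one common check for first-order serial correlation; the abstract does not say which test the paper uses.

```python
import numpy as np

def ols_sse(X, y):
    """Least-squares fit of y on X (X already holds the intercept column).
    Returns the residual vector and the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid, float(resid @ resid)

def r_squared(sse, y):
    """Coefficient of determination, 1 - SSE/SST."""
    sst = float(((y - y.mean()) ** 2).sum())
    return 1.0 - sse / sst

def mallows_cp(sse_reduced, p_reduced, sse_full, p_full, n):
    """Mallows' Cp = SSE_p / s2_full - n + 2p, where p counts all fitted
    parameters (intercept included) and s2_full is the full-model MSE."""
    s2_full = sse_full / (n - p_full)
    return sse_reduced / s2_full - n + 2 * p_reduced

def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest little first-order
    serial correlation in the residuals."""
    return float((np.diff(resid) ** 2).sum() / (resid ** 2).sum())

# Synthetic example (assumption): x3 has no real effect on y.
rng = np.random.default_rng(0)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)

X_full = np.column_stack([np.ones(n), x1, x2, x3])  # 4 parameters
X_red = X_full[:, :3]                               # drops x3, 3 parameters

resid_full, sse_full = ols_sse(X_full, y)
resid_red, sse_red = ols_sse(X_red, y)

cp_full = mallows_cp(sse_full, 4, sse_full, 4, n)   # equals p_full by construction
cp_red = mallows_cp(sse_red, 3, sse_full, 4, n)
r2_red = r_squared(sse_red, y)
dw_red = durbin_watson(resid_red)

print(f"reduced model: R2={r2_red:.3f}, Cp={cp_red:.2f}, DW={dw_red:.2f}")
```

A reduced model whose Cp is close to its parameter count while R² stays high is a candidate with little bias; a Durbin-Watson value far from 2 would indicate that the residuals need a time-series adjustment before the fit is trusted.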
Keywords :
multiple linear regression, model selection, model evaluation, prediction, empirical
Journal title :
Environmental Research
Serial Year :
2007
Record number :
728467