Conference record number:
5318
Article title:
Alternating Conditional Expectation (ACE) Algorithm for robust regression analysis of simulated dataset in the presence of Outliers
Authors:
Khanmohammadi Khorrami Mohammadreza (m.khanmohammadi@sci.ikiu.ac.ir), Department of Chemistry, Faculty of Science, Imam Khomeini International University, Qazvin, Iran; Mohammadi Mahsa, Department of Chemistry, Faculty of Science, Imam Khomeini International University, Qazvin, Iran; S.Hajiseyedrazi Zahra, Department of Chemistry, Faculty of Science, Imam Khomeini International University, Qazvin, Iran
Number of pages:
1
Keywords:
SLS , SM , RM , LMS , Jacobian , AVAS , ACE
Year of publication:
1402
Conference title:
9th Iranian Biennial National Chemometrics Seminar
Document language:
English
Persian abstract:
The purpose of a univariate regression model is to discover the relationship between the response and descriptor variables, which requires accurately estimating the regression coefficients. Random noise and outliers are recognized as important sources of uncertainty in regression models. Outliers are data points with high residuals compared to the other points, and they distort the regression. The problem becomes more challenging as the dataset becomes smaller. Two strategies are used to deal with data contaminated by outliers: first, outlier detection procedures (such as Dixon's test, Grubbs' test, Cook's squared distance, and the squared Mahalanobis distance), and second, robust methods (median-based techniques, the Jacobian matrix method, Additivity and Variance Stabilization (AVAS), and Alternating Conditional Expectation (ACE)) [1-3]. This study investigated the effect of outliers on the performance of various regression methods: simple least squares (SLS); the median-based Single Median (SM), Repeated Median (RM), and Least Median of Squares (LMS) methods; the Jacobian matrix technique; AVAS; and ACE. Six pairs of data points were defined as raw data, and the effect of an outlier in the response variable was investigated. For the dataset with the outlier, SLS, as a representative of classical regression methods, gave weak results. The median-based approaches also performed poorly. The results of the Jacobian method were not satisfactory, which may stem from not defining the initial equation for this model. The performance of AVAS was broadly satisfactory, but ACE was the best procedure. The final statistical results for ACE were an R^2 of 1.000, an adjusted R^2 of 1.000, a sum of squared errors (SSE) of 7.32E-30, and an infinite variance inflation factor (VIF).
Selecting the best transformation function, the one with the highest correlation coefficient between the response and descriptor variables, and stabilizing the error variance distribution are important advantages of ACE that lead to the best results [4].
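To make the comparison between a classical and a median-based fit concrete, the following is a minimal Python sketch of simple least squares next to a repeated-median line fit on six points with one outlier in the response. The data values, the function names, and the choice of the repeated-median estimator in its common slope-then-intercept form are assumptions for illustration only; the abstract does not give the authors' dataset or implementation, so this sketch does not reproduce the reported results.

```python
from statistics import median

def least_squares(x, y):
    """Ordinary least-squares line fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def repeated_median(x, y):
    """Repeated-median line fit: the slope is the median over i of the
    median over j != i of the pairwise slopes; the intercept is the
    median of y_i - slope * x_i. Assumes all x values are distinct."""
    n = len(x)
    per_point = []
    for i in range(n):
        slopes = [(y[j] - y[i]) / (x[j] - x[i]) for j in range(n) if j != i]
        per_point.append(median(slopes))
    slope = median(per_point)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

# Hypothetical data (not from the study): y = 2x + 1,
# with the fourth response replaced by an outlier (30 instead of 9).
x = [1, 2, 3, 4, 5, 6]
y = [3, 5, 7, 30, 11, 13]

ls_slope, ls_intercept = least_squares(x, y)
rm_slope, rm_intercept = repeated_median(x, y)
print("least squares:  ", ls_slope, ls_intercept)
print("repeated median:", rm_slope, rm_intercept)
```

On this invented dataset the single outlier pulls the least-squares slope to about 2.6, while the repeated-median fit recovers the underlying slope of 2 and intercept of 1. How a median-based method behaves on the authors' own six points may differ, as the abstract reports.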
Country:
Iran