DocumentCode
1761920
Title
Pattern-Aided Regression Modeling and Prediction Model Analysis
Author
Dong, Guozhu ; Taslimitehrani, Vahid
Author_Institution
Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
Volume
27
Issue
9
fYear
2015
fDate
Sept. 1 2015
Firstpage
2452
Lastpage
2465
Abstract
This paper first introduces pattern aided regression (PXR) models, a new type of regression model designed to represent accurate and interpretable prediction models. The work was motivated by two observations: (1) Regression modeling applications often involve complex diverse predictor-response relationships, which occur when the optimal regression models (of a given regression model type) fitting two or more distinct logical groups of data are highly different. (2) State-of-the-art regression methods are often unable to adequately model such relationships. This paper defines PXR models using several patterns and local regression models, which respectively serve as logical and behavioral characterizations of distinct predictor-response relationships. The paper also introduces a contrast pattern aided regression (CPXR) method to build accurate PXR models. In experiments, the PXR models built by CPXR are very accurate in general, often outperforming state-of-the-art regression methods by large margins. Because they usually use (a) around seven simple patterns and (b) linear local regression models, those PXR models are easy to interpret; in fact, their complexity is only slightly higher than that of (piecewise) linear regression models and significantly lower than that of traditional ensemble-based regression models. CPXR is especially effective for high-dimensional data. The paper also discusses how to use the CPXR methodology for analyzing prediction models and correcting their prediction errors.
Keywords
formal logic; pattern recognition; regression analysis; CPXR method; PXR models; complex diverse predictor-response relationships; contrast pattern aided regression; linear local regression models; logical groups; prediction model analysis; Analytical models; Biological system modeling; Computational modeling; Data models; Linear regression; Predictive models; Regression tree analysis; Correlation and regression analysis; data mining; error analysis; mining methods and algorithms; model validation and analysis
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2015.2411609
Filename
7058431