• DocumentCode
    1761920
  • Title

    Pattern-Aided Regression Modeling and Prediction Model Analysis

  • Author

    Dong, Guozhu; Taslimitehrani, Vahid

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
  • Volume
    27
  • Issue
    9
  • fYear
    2015
  • fDate
    Sept. 1 2015
  • Firstpage
    2452
  • Lastpage
    2465
  • Abstract
    This paper first introduces pattern aided regression (PXR) models, a new type of regression model designed to represent accurate and interpretable prediction models. The work is motivated by two observations: (1) Regression modeling applications often involve complex diverse predictor-response relationships, which occur when the optimal regression models (of a given regression model type) fitting two or more distinct logical groups of data are highly different. (2) State-of-the-art regression methods are often unable to model such relationships adequately. The paper defines PXR models using several patterns and local regression models, which respectively serve as logical and behavioral characterizations of distinct predictor-response relationships. The paper also introduces a contrast pattern aided regression (CPXR) method for building accurate PXR models. In experiments, the PXR models built by CPXR are very accurate in general, often outperforming state-of-the-art regression methods by large margins. Because they usually use (a) around seven simple patterns and (b) linear local regression models, these PXR models are easy to interpret; in fact, their complexity is only slightly higher than that of (piecewise) linear regression models and significantly lower than that of traditional ensemble-based regression models. CPXR is especially effective for high-dimensional data. The paper also discusses how to use the CPXR methodology for analyzing prediction models and correcting their prediction errors.
  • Keywords
    formal logic; pattern recognition; regression analysis; CPXR method; PXR models; complex diverse predictor-response relationships; contrast pattern aided regression; linear local regression models; logical groups; prediction model analysis; Analytical models; Biological system modeling; Computational modeling; Data models; Linear regression; Predictive models; Regression tree analysis; Correlation and regression analysis; data mining; error analysis; mining methods and algorithms; model validation and analysis
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2015.2411609
  • Filename
    7058431