Title :
Cost-Sensitive Parsimonious Linear Regression
Author :
Goetschalckx, Robby ; Driessens, Kurt ; Sanner, Scott
Author_Institution :
Katholieke Univ. Leuven, Leuven
Abstract :
We examine linear regression problems where some features may only be observable at a cost (e.g., in medical domains where features may correspond to diagnostic tests that take time and cost money). This matters in data mining when the goal is to obtain the best predictions from the data on a limited cost budget. We define a parsimonious linear regression objective criterion that jointly minimizes prediction error and feature cost. We modify least angle regression algorithms commonly used for sparse linear regression to produce the ParLiR algorithm, which not only provides an efficient and parsimonious solution, as we demonstrate empirically, but also provides formal guarantees that we prove theoretically.
Keywords :
data mining; regression analysis; ParLiR algorithm; cost-sensitive parsimonious linear regression; parsimonious linear regression objective criterion; prediction error; sparse linear regression; Australia; Costs; Linear regression; Machine learning; Machine learning algorithms; Measurement units; Medical diagnostic imaging; Medical tests; Sampling methods; Cost sensitivity; regression; sparsity
Conference_Title :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.76