• DocumentCode
    22799
  • Title
    INFUSE: Interactive Feature Selection for Predictive Modeling of High Dimensional Data

  • Author
    Krause, Jan ; Perer, Adam ; Bertini, Enrico

  • Volume
    20
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 31 2014
  • Firstpage
    1614
  • Lastpage
    1623
  • Abstract
    Predictive modeling techniques are increasingly being used by data scientists to understand the probability of predicted outcomes. However, for data that is high-dimensional, a critical step in predictive modeling is determining which features should be included in the models. Feature selection algorithms are often used to remove non-informative features from models. However, there are many different classes of feature selection algorithms. Deciding which one to use is problematic as the algorithmic output is often not amenable to user interpretation. This limits the ability for users to utilize their domain expertise during the modeling process. To improve on this limitation, we developed INFUSE, a novel visual analytics system designed to help analysts understand how predictive features are being ranked across feature selection algorithms, cross-validation folds, and classifiers. We demonstrate how our system can lead to important insights in a case study involving clinical researchers predicting patient outcomes from electronic medical records.
  • Keywords
    data analysis; data visualisation; pattern classification; INFUSE system; algorithmic output; classifiers; clinical researchers; cross-validation folds; electronic medical records; feature selection algorithms; high dimensional data modeling; interactive feature selection; patient outcomes; predicted outcomes; predictive modeling; predictive modeling techniques; user interpretation; visual analytics system; Algorithm design and analysis; Data models; Data visualization; Feature extraction; Prediction algorithms; Predictive models; Predictive modeling; classification; feature selection; high-dimensional data; visual analytics
  • fLanguage
    English
  • Journal_Title
    Visualization and Computer Graphics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1077-2626
  • Type
    jour

  • DOI
    10.1109/TVCG.2014.2346482
  • Filename
    6876047