• DocumentCode
    3112548
  • Title

    Parametric classification over multiple samples

  • Author

    Russo, Barbara

  • Author_Institution
    Fac. of Comput. Sci., Free Univ. of Bozen-Bolzano, Bolzano, Italy
  • fYear
    2013
  • fDate
    21-21 May 2013
  • Firstpage
    23
  • Lastpage
    25
  • Abstract
    This pattern was originally designed to classify sequences of events in log files by error-proneness. Sequences of events trace application use in real contexts. As such, identifying error-prone sequences helps understand and predict application use. The classification problem we describe is typical in supervised machine learning, but the composite pattern we propose investigates it with several techniques to control for data brittleness. Data pre-processing, feature selection, parametric classification, and cross-validation are the major instruments that enable a good degree of control over this classification problem. In particular, the pattern includes a solution for typical problems that occur when data comes from several samples of different populations and with different degrees of sparsity.
  • Keywords
    learning (artificial intelligence); pattern classification; classification problem; cross-validation; data pre-processing; error-prone sequences; feature selection; parametric classification; supervised machine learning; Accuracy; Correlation; Sociology; Software; Training; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Analysis Patterns in Software Engineering (DAPSE), 2013 1st International Workshop on
  • Conference_Location
    San Francisco, CA
  • Type

    conf

  • DOI
    10.1109/DAPSE.2013.6603805
  • Filename
    6603805