DocumentCode
3112548
Title
Parametric classification over multiple samples
Author
Russo, Barbara
Author_Institution
Fac. of Comput. Sci., Free Univ. of Bozen-Bolzano, Bolzano, Italy
fYear
2013
fDate
21 May 2013
Firstpage
23
Lastpage
25
Abstract
This pattern was originally designed to classify sequences of events in log files by error-proneness. Sequences of events trace application use in real contexts. As such, identifying error-prone sequences helps to understand and predict application use. The classification problem we describe is typical in supervised machine learning, but the composite pattern we propose investigates it with several techniques to control for data brittleness. Data pre-processing, feature selection, parametric classification, and cross-validation are the major instruments that enable a good degree of control over this classification problem. In particular, the pattern includes a solution for typical problems that occur when data come from several samples of different populations with different degrees of sparsity.
Keywords
learning (artificial intelligence); pattern classification; classification problem; cross-validation; data pre-processing; error-prone sequences; feature selection; parametric classification; supervised machine learning; Accuracy; Correlation; Sociology; Software; Training; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Analysis Patterns in Software Engineering (DAPSE), 2013 1st International Workshop on
Conference_Location
San Francisco, CA
Type
conf
DOI
10.1109/DAPSE.2013.6603805
Filename
6603805