Author/Authors :
Mohseni, Navid, Department of Computer Engineering - Babol Branch, Islamic Azad University, Babol, Iran; Nematzadeh, Hossein, Department of Computer Engineering - Sari Branch, Islamic Azad University, Sari, Iran; Akbari, Ebrahim, Department of Computer Engineering - Sari Branch, Islamic Azad University, Sari, Iran
Abstract :
Outlier detection is a technique for recognizing samples that lie outside the main population of a data set. Outliers have a negative impact on classification, so recognized outliers are generally deleted to improve classification performance. This paper proposes a method for detecting outliers among test samples, combined with supervised training set selection. Training set selection is based on the intersection of three well-known similarity measures, namely Jaccard, cosine, and Dice. Each test sample is evaluated against the selected training set for possible outlier detection. The selected training set is then used for a two-stage classification. The accuracy of the classifiers increases after outlier deletion. A majority voting function is used to further improve the classifiers.
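The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the three measures are combined beyond "intersection", so the sketch assumes binary feature vectors, a hypothetical per-measure threshold, and a hypothetical rule that flags a test sample as a possible outlier when the intersection is empty. The function names (`select_training_set`, `is_possible_outlier`, `majority_vote`) are illustrative only, and the two-stage classification is not shown.

```python
from collections import Counter

import numpy as np


def jaccard(a, b):
    """Jaccard similarity of two binary vectors: |A and B| / |A or B|."""
    union = np.sum(np.logical_or(a, b))
    return np.sum(np.logical_and(a, b)) / union if union else 0.0


def dice(a, b):
    """Dice similarity of two binary vectors: 2|A and B| / (|A| + |B|)."""
    total = np.sum(a) + np.sum(b)
    return 2.0 * np.sum(np.logical_and(a, b)) / total if total else 0.0


def cosine(a, b):
    """Cosine similarity: a.b / (||a|| * ||b||)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b)) / denom if denom else 0.0


def select_training_set(X_train, x_test, thresholds=(0.5, 0.5, 0.5)):
    """Return indices of training rows that pass all three measures.

    Assumption: a row "passes" a measure when its similarity to x_test
    is at least that measure's threshold (Jaccard, cosine, Dice order).
    """
    selected = None
    for measure, t in zip((jaccard, cosine, dice), thresholds):
        passed = {i for i, row in enumerate(X_train) if measure(row, x_test) >= t}
        selected = passed if selected is None else selected & passed
    return sorted(selected)


def is_possible_outlier(X_train, x_test, thresholds=(0.5, 0.5, 0.5)):
    """Assumed rule: an empty intersection marks x_test as a possible outlier."""
    return len(select_training_set(X_train, x_test, thresholds)) == 0


def majority_vote(predictions):
    """Return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]
```

As a usage example, with `X_train = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 1, 0]])`, the test sample `np.array([1, 1, 0, 0])` selects the first two rows, while `np.array([0, 0, 0, 1])` shares no features with any training row and is flagged as a possible outlier under the assumed rule.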