Regional Information Center for Science and Technology (RICEST)

DocumentCode :

2976272

Title :

Clustering for classification

Author :

Evans, Reuben ; Pfahringer, Bernhard ; Holmes, Geoffrey

Author_Institution :

Comput. Sci. Dept., Univ. of Waikato, Hamilton, New Zealand

fYear :

2011

fDate :

12-13 July 2011

Firstpage :

Lastpage :

Abstract :

Advances in technology have provided industry with an array of devices for collecting data. The frequency and scale of data collection mean that many large datasets are now being generated. To find patterns in these datasets, it would be useful to apply modern classification methods such as support vector machines. Unfortunately, these methods are computationally expensive (quadratic in the number of data points, in fact) and so cannot be applied directly. This paper proposes a framework in which a variety of clustering methods can be used to summarise datasets, that is, to reduce them to smaller but still representative datasets to which advanced methods can be applied. It compares the results of using this framework against random selection on a large number of classification problems. The results show that clustering prior to classification is beneficial when a sophisticated classifier is employed; when the classifier is simple, however, the benefits over random selection do not justify the added cost of clustering. The results also show that, for each dataset, it is important to choose the clustering method carefully.
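
The summarise-then-classify idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: plain k-means run separately on each class stands in for the "variety of clustering methods", a 1-nearest-neighbour rule stands in for the classifier (the paper's sophisticated classifier is a support vector machine), the data are two synthetic Gaussian blobs, and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed stand-in clusterer); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def summarise_by_clustering(X, y, per_class, seed=0):
    """Cluster each class separately; the centroids become labelled representatives."""
    summary = []
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        k = min(per_class, len(members))
        summary += [(c, label) for c in kmeans(members, k, seed=seed)]
    return summary

def summarise_by_random(X, y, size, seed=0):
    """Baseline: keep a uniformly random subset of the labelled points."""
    rng = random.Random(seed)
    return [(X[i], y[i]) for i in rng.sample(range(len(X)), size)]

def predict_1nn(summary, x):
    """Label of the nearest representative (simple stand-in classifier)."""
    return min(summary, key=lambda item: sq_dist(item[0], x))[1]

def accuracy(summary, X, y):
    """Fraction of points in (X, y) that the summary classifies correctly."""
    hits = sum(predict_1nn(summary, x) == lab for x, lab in zip(X, y))
    return hits / len(X)

def make_blobs(n_per_class, seed=0):
    """Two synthetic 2-D Gaussian classes centred at (0, 0) and (6, 6)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (6.0, 6.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            y.append(label)
    return X, y

X_train, y_train = make_blobs(200, seed=1)
X_test, y_test = make_blobs(100, seed=2)

cluster_summary = summarise_by_clustering(X_train, y_train, per_class=5)
random_summary = summarise_by_random(X_train, y_train, size=len(cluster_summary))

print("clustered summary accuracy:", accuracy(cluster_summary, X_test, y_test))
print("random summary accuracy:   ", accuracy(random_summary, X_test, y_test))
```

Both summaries contain the same number of points, so any classifier trained on them has the same cost; only the way the representatives are chosen differs. On a toy problem this easy the two approaches are unlikely to differ much, so the sketch shows the shape of the comparison and says nothing about the paper's reported results.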

Keywords :

pattern classification; pattern clustering; classification methods; clustering methods; random selection; support vector machines; Accuracy; Classification algorithms; Clustering algorithms; Clustering methods; Linear regression; Numerical models; Testing

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Information Technology in Asia (CITA 11), 2011 7th International Conference on

Conference_Location :

Kuching, Sarawak

Print_ISBN :

978-1-61284-128-1

Electronic_ISBN :

978-1-61284-130-4

Type :

conf

DOI :

10.1109/CITA.2011.5998839

Filename :

5998839

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2976272