Title :
Fast data sampling for large scale support vector machines
Author :
Rahul K. Sevakula;Mohammed Suhail;Nishchal K. Verma
Author_Institution :
Department of Electrical Engineering, Indian Institute of Technology Kanpur, India
Abstract :
Traditional algorithms for training Support Vector Machines (SVMs) have a worst-case time complexity of O(n³) and a space complexity of O(n²), which makes it difficult to scale the training to large datasets. In this paper, three algorithms are proposed for reducing the training dataset. The algorithms mine potential support vectors based on their closeness to the decision boundary and use only these points for learning the hyperplane. Closeness to the boundary is estimated using spatial distribution descriptors such as the median and quartiles. A distance-based algorithm is first proposed for the linear SVM, and the same is later extended to the kernel SVM using projection vectors. The proposed data sampling algorithms have a time complexity of O(n). In experiments, the algorithms drastically reduce the number of training samples and, accordingly, the training time of the SVM, generally with little compromise in classification accuracy.
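The abstract's idea of keeping only points near the decision boundary, judged with order statistics like the median, can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the boundary proxy (the midpoint between class means), the projection direction, and the median threshold are all illustrative assumptions.

```python
# Hypothetical sketch of closeness-to-boundary data sampling for a linear SVM.
# Assumption: the direction joining the two class means approximates the normal
# of the separating hyperplane, and the midpoint between the means approximates
# the boundary. Neither is taken from the paper; both are illustrative.

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sample_near_boundary(class0, class1):
    """For each class, keep the points whose projection onto the line joining
    the class means lies closest to the midpoint (a boundary proxy), using the
    per-class median distance as the cut-off. The sort makes this O(n log n);
    a linear-time selection of the median would give the O(n) the paper claims."""
    m0, m1 = mean(class0), mean(class1)
    w = [b - a for a, b in zip(m0, m1)]                   # direction between means
    mid = dot([(a + b) / 2 for a, b in zip(m0, m1)], w)   # boundary proxy (scalar)
    kept = []
    for cls in (class0, class1):
        dists = [abs(dot(x, w) - mid) for x in cls]       # closeness to boundary
        med = sorted(dists)[len(dists) // 2]              # median as threshold
        kept.append([x for x, d in zip(cls, dists) if d <= med])
    return kept
```

On two well-separated 1-D clusters, the sampler discards the points farthest from the gap between the classes and retains the inner points, which are the plausible support vectors.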
Keywords :
"Training","Support vector machines","Time complexity","Kernel","Training data","Optimization"
Conference_Titel :
2015 IEEE Workshop on Computational Intelligence: Theories, Applications and Future Directions (WCI)
DOI :
10.1109/WCI.2015.7495509