Title :
Practical Optimizations for Perceptron Algorithms in Large Malware Dataset
Author :
Gavrilut, Dragos ; Benchea, R. ; Vatamanu, Cristina
Author_Institution :
Romania Bitdefender Anti-virus Res. Lab., Al. I. Cuza Univ. of Iasi, Iasi, Romania
Abstract :
Due to the increasing number of malware samples in the past 4 years, machine learning algorithms have emerged as an important tool in automated malware detection. Building a detection model this way, however, takes a long time when the data set grows continually. Frequent changes in malware families and the increasing training time make the model less efficient and increase the probability of false alarms. This paper addresses the problem by reducing the time needed to create a detection model on very large databases, and it suggests three different optimization techniques. First, the perceptron algorithm was adapted to the map-reduce paradigm so that it can run in a distributed manner. Second, hardware-specific optimizations were applied for faster mathematical computations. Finally, a cache system was used to reduce the quantity of data processed by the algorithm. Although these methods were designed and tested for malware databases, they can easily be adapted to other databases as well.
Keywords :
cache storage; invasive software; learning (artificial intelligence); perceptrons; very large databases; automated malware detection; cache system; data processing; false alarm probability; large malware dataset; machine learning algorithm; malware database; malware family; map reduce paradigm; mathematical computation; perceptron algorithm; practical optimization; very large databases; Adaptation models; Algorithm design and analysis; Databases; Machine learning algorithms; Malware; Optimization; Training; Perceptron; algorithm; database; optimization;
Conference_Title :
Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2012 14th International Symposium on
Conference_Location :
Timisoara
Print_ISBN :
978-1-4673-5026-6
DOI :
10.1109/SYNASC.2012.33