• DocumentCode
    1594503
  • Title
    Practical Optimizations for Perceptron Algorithms in Large Malware Dataset
  • Author
    Gavrilut, Dragos ; Benchea, R. ; Vatamanu, Cristina
  • Author_Institution
    Romania Bitdefender Anti-virus Res. Lab., Al. I. Cuza Univ. of Iasi, Iasi, Romania
  • fYear
    2012
  • Firstpage
    240
  • Lastpage
    246
  • Abstract
    Due to the increasing number of malware samples in the past 4 years, machine learning algorithms have emerged as an important tool in automated malware detection. Creating the detection model this way, however, requires a lot of time on a continually growing dataset. Frequent changes in malware families and the increasing training time make the model less efficient and increase the probability of false alarms. This paper addresses the problem by reducing the time needed to create a detection model on very large databases, and suggests three different optimization techniques. First, the perceptron algorithm was adjusted to use the map-reduce paradigm so that it runs in a distributed manner. Second, hardware-specific optimizations were applied for faster mathematical computations. Finally, a cache system was used to reduce the quantity of data processed by the algorithm. Although these methods were designed and tested for malware databases, they can easily be adjusted for other databases as well.
  • Keywords
    cache storage; invasive software; learning (artificial intelligence); perceptrons; very large databases; automated malware detection; cache system; data processing; false alarm probability; large malware dataset; machine learning algorithm; malware database; malware family; map reduce paradigm; mathematical computation; perceptron algorithm; practical optimization; very large databases; Adaptation models; Algorithm design and analysis; Databases; Machine learning algorithms; Malware; Optimization; Training; Perceptron; algorithm; database; optimization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2012 14th International Symposium on
  • Conference_Location
    Timisoara
  • Print_ISBN
    978-1-4673-5026-6
  • Type
    conf
  • DOI
    10.1109/SYNASC.2012.33
  • Filename
    6481036