• DocumentCode
    2348865
  • Title
    Performance evaluation of a machine learning algorithm for early application identification
  • Author
    Verticale, Giacomo ; Giacomazzi, Paolo

  • Author_Institution
    Dipt. di Elettron. e Inf., Politec. di Milano, Milan
  • fYear
    2008
  • fDate
    20-22 Oct. 2008
  • Firstpage
    845
  • Lastpage
    849
  • Abstract
    The early identification of applications through the observation and fast analysis of the associated packet flows is a critical building block of intrusion detection and policy enforcement systems. The simple techniques currently used in practice, such as looking at the transport port numbers or at the application payload, are increasingly less effective for new applications using random port numbers and/or encryption. Therefore, there is increasing interest in machine learning techniques capable of identifying applications by examining features of the associated traffic process such as packet lengths and inter-arrival times. However, these techniques require that the classification algorithm be trained with examples of the traffic generated by the applications to be identified, possibly on the link where the classifier will operate. In this paper we provide two new contributions. First, we apply the C4.5 decision tree algorithm to the problem of early application identification (i.e., looking at the first packets of the flow) and show that it has better performance than the algorithms proposed in the literature. Second, we evaluate the performance of the classifier when training is performed on a link different from the link where the classifier operates. This is an important issue, as a pre-trained portable classifier would greatly facilitate the deployment and management of the classification infrastructure.
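    The split criterion that C4.5 is known for, the gain ratio, can be illustrated on a single numeric flow feature. The following is a minimal sketch only, not the authors' implementation: the record gives no feature set, parameters, or data, so the function names (`entropy`, `best_threshold`), the choice of first-packet size as the feature, and the six toy flows are all assumptions made for illustration. Full C4.5 also adds refinements (pruning, handling of missing values) that are omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain_ratio) of the best binary split on one numeric feature.

    Simplified: candidate thresholds are midpoints between consecutive
    distinct values, and the split with the highest gain ratio wins.
    """
    base = entropy(labels)
    best = (None, 0.0)
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        p_left = len(left) / len(labels)
        # Information gain: entropy removed by splitting at t.
        gain = base - p_left * entropy(left) - (1 - p_left) * entropy(right)
        # Split information: entropy of the partition sizes themselves.
        split_info = entropy(["L"] * len(left) + ["R"] * len(right))
        ratio = gain / split_info  # both sides are non-empty, so split_info > 0
        if ratio > best[1]:
            best = (t, ratio)
    return best


# Hypothetical toy data: size in bytes of the first packet of six flows,
# with made-up application labels.
first_packet_size = [60, 64, 70, 500, 520, 540]
application = ["interactive"] * 3 + ["bulk"] * 3

threshold, ratio = best_threshold(first_packet_size, application)
print(threshold, ratio)
```

    On this toy data the two classes are perfectly separable, so the chosen threshold falls between 70 and 500 bytes. A tree would be grown by applying the same selection recursively to each side, over all available features of the first packets.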
  • Keywords
    cryptography; decision trees; learning (artificial intelligence); software performance evaluation; telecommunication traffic; C4.5 decision tree algorithm; associated packet flows; associated traffic process; classification algorithm; encryption; intrusion detection; machine learning algorithm; performance evaluation; policy enforcement systems; random port numbers; transport port numbers; Bayesian methods; Classification algorithms; Clustering algorithms; Computer science; Hidden Markov models; Inspection; Machine learning; Machine learning algorithms; Payloads; Peer to peer computing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Technology, 2008. IMCSIT 2008. International Multiconference on
  • Conference_Location
    Wisła
  • Print_ISBN
    978-83-60810-14-9
  • Type
    conf
  • DOI
    10.1109/IMCSIT.2008.4747340
  • Filename
    4747340