• DocumentCode
    2076182
  • Title
    Network traffic clustering using Random Forest proximities
  • Author
    Yu Wang ; Yang Xiang ; Jun Zhang
  • Author_Institution
    Sch. of Inf. Technol., Deakin Univ., Melbourne, VIC, Australia
  • fYear
    2013
  • fDate
    9-13 June 2013
  • Firstpage
    2058
  • Lastpage
    2062
  • Abstract
    Recent years have seen extensive work on statistics-based network traffic classification using machine learning (ML) techniques. In the particular scenario of learning from unlabeled traffic data, some classic unsupervised clustering algorithms (e.g., K-Means and EM) have been applied, but the reported accuracy is unsatisfactorily low. This paper presents a novel approach for the task, which performs clustering based on Random Forest (RF) proximities instead of Euclidean distances. The approach consists of two steps. In the first step, we derive a proximity measure for each pair of data points by performing an RF classification on the original data and a set of synthetic data. In the second step, we perform K-Medoids clustering to partition the data points into K groups based on the proximity matrix. Evaluations on real-world Internet traffic traces indicate that the proposed approach is more accurate than previous methods.
  • Keywords
    Internet; learning (artificial intelligence); pattern classification; pattern clustering; statistics; telecommunication traffic; Euclidean distances; ML techniques; RF classification; RF proximities; k-medoids clustering; machine learning techniques; network traffic clustering; proximity matrix; proximity measure; random forest proximities; real-world Internet traffic traces; statistics-based network traffic classification; unlabeled traffic data; unsupervised clustering algorithms; Accuracy; Classification algorithms; Clustering algorithms; IP networks; Internet; Radio frequency; Telecommunication traffic; Clustering; Machine Learning; Traffic Analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications (ICC), 2013 IEEE International Conference on
  • Conference_Location
    Budapest
  • ISSN
    1550-3607
  • Type
    conf
  • DOI
    10.1109/ICC.2013.6654829
  • Filename
    6654829
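
The two steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not say how the synthetic data are generated, how proximity is computed, or which K-Medoids variant is used. The sketch assumes the common unsupervised Random Forest recipe (synthetic rows built by permuting each feature column independently, proximity taken as the fraction of trees in which two points share a leaf) and a simple alternating K-Medoids on the distance 1 - proximity. The function names `rf_proximity` and `k_medoids`, the parameter values, and the toy data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(X, n_trees=50, seed=0):
    """Step 1 (assumed recipe): proximity from an RF trained to
    separate the original rows from synthetic rows."""
    rng = np.random.default_rng(seed)
    # Assumption: synthetic data = each column permuted independently,
    # which keeps the marginals but destroys feature dependencies.
    synth = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )
    data = np.vstack([X, synth])
    labels = np.r_[np.ones(len(X), dtype=int), np.zeros(len(synth), dtype=int)]
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    rf.fit(data, labels)
    leaves = rf.apply(X)  # shape (n_samples, n_trees): leaf index per tree
    n = len(X)
    prox = np.zeros((n, n))
    for t in range(leaves.shape[1]):
        col = leaves[:, t]
        # Count pairs that land in the same leaf of tree t.
        prox += col[:, None] == col[None, :]
    return prox / leaves.shape[1]


def k_medoids(dist, k, n_iter=100, seed=0):
    """Step 2 (simple alternating variant): K-Medoids on a
    precomputed distance matrix."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(dist.shape[0], size=k, replace=False)
    for _ in range(n_iter):
        assign = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(assign == c)
            if members.size:
                # New medoid: member with the smallest total distance
                # to the other members of its cluster.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return np.argmin(dist[:, medoids], axis=1), medoids


# Toy stand-in for flow statistics (hypothetical data, two groups).
demo_rng = np.random.default_rng(1)
X = np.vstack([
    demo_rng.normal(0.0, 1.0, size=(20, 4)),
    demo_rng.normal(5.0, 1.0, size=(20, 4)),
])
prox = rf_proximity(X)
cluster_labels, medoids = k_medoids(1.0 - prox, k=2)
```

Working from a precomputed proximity matrix is what makes K-Medoids a natural fit here: unlike K-Means, it only needs pairwise dissimilarities and never has to average feature vectors.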