• DocumentCode
    3156128
  • Title

    Internet Traffic Classification Using Machine Learning

  • Author

    Jun, Li ; Shunyi, Zhang ; Yanqing, Lu ; Zailong, Zhang

  • Author_Institution
    Nanjing Univ. of Posts & Telecommun., Nanjing
  • fYear
    2007
  • fDate
    22-24 Aug. 2007
  • Firstpage
    239
  • Lastpage
    243
  • Abstract
    Internet traffic identification and classification are vital to network management and security monitoring, network planning, and QoS provision. Traditional approaches such as port-based and payload-based identification are becoming increasingly difficult to apply, as many new applications (e.g. P2P) use dynamic port numbers, masquerading techniques, and encryption to avoid detection. An alternative approach is to classify traffic by exploiting the distinctive characteristics of flow statistics. We present here a traffic classification scheme based on machine learning (ML). The performance impact of dataset size, feature selection, and ML algorithm selection is demonstrated by experiments. Genetic-algorithm-based feature selection can dramatically reduce ML learning and modeling time with only a small decrease, or even a slight increase, in classification accuracy. The chosen ML algorithms (TAN, C4.5, NBTree, RandomForest, and distance-weighted KNN) can reach high classification accuracy. Typically, C4.5 and RandomForest are superior to the other ML algorithms in computational complexity. In addition, experiments show that the size of the dataset affects classification performance, and tuning the dataset's size could meet the requirements of specific applications.
  • Keywords
    Internet; computer network management; genetic algorithms; quality of service; telecommunication network planning; telecommunication traffic; Internet traffic classification; Internet traffic identification; QoS provision; feature selection; genetic algorithm; machine learning; network management; network planning; payload-based identification; security monitoring; Cryptography; Genetic algorithms; IP networks; Internet; Machine learning; Machine learning algorithms; Monitoring; Statistics; Telecommunication traffic; Traffic control; Feature Selection; Machine Learning (ML); Traffic classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications and Networking in China, 2007. CHINACOM '07. Second International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-1009-5
  • Electronic_ISBN
    978-1-4244-1009-5
  • Type

    conf

  • DOI
    10.1109/CHINACOM.2007.4469372
  • Filename
    4469372