DocumentCode :
1717633
Title :
Training on multiple sub-flows to optimise the use of Machine Learning classifiers in real-world IP networks
Author :
Nguyen, Thuy T T ; Armitage, Grenville
Author_Institution :
Centre for Adv. Internet Archit., Swinburne Univ. of Technol., Melbourne, Vic.
fYear :
2006
Firstpage :
369
Lastpage :
376
Abstract :
Literature on the use of machine learning (ML) algorithms for classifying IP traffic has relied on full-flows or the first few packets of flows. In contrast, many real-world scenarios require a classification decision well before a flow has finished even if the flow's beginning is lost. This implies classification must be achieved using statistics derived from the most recent N packets taken at any arbitrary point in a flow's lifetime. We propose training the classifier on a combination of short sub-flows (extracted from full-flow examples of the target application's traffic). We demonstrate this optimisation using the naive Bayes ML algorithm, and show that our approach results in excellent performance even when classification is initiated mid-way through a flow with windows as small as 25 packets long. We suggest future use of unsupervised ML algorithms to identify optimal sub-flows for training.
Keywords :
Bayes methods; IP networks; learning (artificial intelligence); telecommunication traffic; IP network; IP traffic; machine learning classifier; naive Bayes algorithm; Government; Inspection; Intrusion detection; Machine learning; Machine learning algorithms; Payloads; Protocols; TCPIP; Telecommunication traffic
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Local Computer Networks, Proceedings 2006 31st IEEE Conference on
Conference_Location :
Tampa, FL
ISSN :
0742-1303
Print_ISBN :
1-4244-0418-5
Electronic_ISBN :
0742-1303
Type :
conf
DOI :
10.1109/LCN.2006.322122
Filename :
4116573