DocumentCode
1866326
Title
Flow-level spam modelling using separate data sources
Author
Luckner, Marcin ; Filasiak, Robert
Author_Institution
Fac. of Math. & Inf. Sci., Warsaw Univ. of Technol., Warsaw, Poland
fYear
2013
fDate
8-11 Sept. 2013
Firstpage
91
Lastpage
98
Abstract
Spam detection based on flow-level statistics is a new approach among anti-spam techniques. The approach reduces the amount of collected data but can still obtain relatively good results in a spam detection task. The main problems in the approach are the selection of flow-level features that describe spam and the identification of discrimination rules. In this work, a flow-level model of spam is presented. The model describes spam subclasses and provides information about the major features for a spam detection task. The model is the basis for decision trees that detect spam. The analysis of the detectors, which were learned from data collected from different mail servers, results in a universal spam description consisting of the most significant features. Flows described by the selected features and collected on a Broadband Remote Access Server were analysed by an ensemble of the created classifiers. The ensemble detected major sources of spam among senders' IP addresses.
Keywords
decision trees; statistical analysis; unsolicited e-mail; antispam technique; broadband remote access server; decision trees; flow-level feature; flow-level spam modelling; flow-level statistics; mail server; spam detection task; spam subclasses; Accuracy; Data models; Decision trees; IP networks; Servers; Unsolicited electronic mail; Anomaly detection; Flow analysis; Spam detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on
Conference_Location
Kraków
Type
conf
Filename
6643981