• DocumentCode
    1866326
  • Title

    Flow-level spam modelling using separate data sources

  • Author

    Luckner, Marcin ; Filasiak, Robert

  • Author_Institution
    Fac. of Math. & Inf. Sci., Warsaw Univ. of Technol., Warsaw, Poland
  • fYear
    2013
  • fDate
    8-11 Sept. 2013
  • Firstpage
    91
  • Lastpage
    98
  • Abstract
    Spam detection based on flow-level statistics is a new approach among anti-spam techniques. The approach reduces the amount of collected data but can still obtain relatively good results in a spam detection task. The main problems in the approach are the selection of flow-level features that describe spam and the detection of discrimination rules. In this work, a flow-level model of spam is presented. The model describes spam subclasses and provides information about the major features of a spam detection task. The model is the basis for decision trees that detect spam. The analysis of detectors, which were learned from data collected from different mail servers, results in a universal spam description consisting of the most significant features. Flows described by the selected features and collected on a Broadband Remote Access Server were analysed by an ensemble of the created classifiers. The ensemble detected major sources of spam among senders' IP addresses.
  • Keywords
    decision trees; statistical analysis; unsolicited e-mail; antispam technique; broadband remote access server; decision trees; flow-level feature; flow-level spam modelling; flow-level statistics; mail server; spam detection task; spam subclasses; Accuracy; Data models; Decision trees; IP networks; Servers; Unsolicited electronic mail; Anomaly detection; Flow analysis; Spam detection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on
  • Conference_Location
    Kraków
  • Type

    conf

  • Filename
    6643981