• DocumentCode
    1611174
  • Title

    Discovering Packet Structure through Lightweight Hierarchical Clustering

  • Author

    Hijazi, Abdulrahman ; Inoue, Hajime ; Matrawy, Ashraf ; Van Oorschot, P.C. ; Somayaji, Anil

  • Author_Institution
    Carleton Comput. Security Lab., Carleton Univ., Ottawa, ON
  • fYear
    2008
  • Firstpage
    33
  • Lastpage
    39
  • Abstract
    The complexity of current Internet applications makes understanding network traffic a challenging task. By providing larger-scale aggregates for analysis, unsupervised clustering approaches can greatly aid in the identification of new applications, attacks, and other changes in network usage patterns. In this paper we introduce ADHIC, a new algorithm that clusters similar network traffic together without prior knowledge of protocol structures. Packet similarity is determined through comparisons of substrings within packets at distinguishing offsets. ADHIC is notable in that it 1) produces a hierarchical decomposition of network traffic in the form of a cluster-identifying decision tree, 2) needs only a small fraction of packets to generate the tree, and 3) clusters packets at wire speeds. We find that ADHIC appropriately segregates well-known protocols, clusters together traffic of the same protocol running on multiple ports, and segregates traffic from applications, such as p2p, that do not use standard ports. Potential applications include network performance analysis, real-time alerts of flash crowds or worm activity, and dynamic DoS-resistant bandwidth management. NetADHICT, our implementation of ADHIC, is available for download and is licensed under the GNU GPL.
  • Keywords
    Internet; bandwidth allocation; decision trees; invasive software; pattern clustering; protocols; telecommunication security; telecommunication traffic; ADHIC algorithm; Internet; decision tree; dynamic DoS-resistant bandwidth management; lightweight hierarchical clustering; network performance analysis; network traffic; packet structure discovery; protocol; unsupervised clustering; worm activity; Aggregates; Bandwidth; Clustering algorithms; Decision trees; IP networks; Pattern analysis; Performance analysis; Protocols; Telecommunication traffic; Wire;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications, 2008. ICC '08. IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-2075-9
  • Electronic_ISBN
    978-1-4244-2075-9
  • Type

    conf

  • DOI
    10.1109/ICC.2008.15
  • Filename
    4533051
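
The abstract describes classification by a decision tree whose tests compare substrings at particular packet offsets. The following is a minimal sketch of that general idea only, not the ADHIC algorithm or the NetADHICT code: the names `Node`, `Leaf`, `classify`, and `EXAMPLE_TREE`, as well as the offsets, patterns, and cluster labels, are hypothetical and chosen purely for illustration. How ADHIC actually selects offsets and builds the tree is not stated in this record and is not modeled here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: names the cluster a packet falls into (hypothetical)."""
    cluster: str


@dataclass
class Node:
    """Internal node: tests whether `pattern` occurs at byte `offset`."""
    offset: int
    pattern: bytes
    match: "Tree"      # subtree followed when the substring matches
    no_match: "Tree"   # subtree followed otherwise


Tree = Union[Leaf, Node]


def classify(packet: bytes, tree: Tree) -> str:
    """Walk the tree from the root, one substring comparison per level."""
    while isinstance(tree, Node):
        end = tree.offset + len(tree.pattern)
        # A packet too short to contain the pattern simply fails the test.
        if packet[tree.offset:end] == tree.pattern:
            tree = tree.match
        else:
            tree = tree.no_match
    return tree.cluster


# Hand-built toy tree (assumption): real trees would be learned from traffic.
EXAMPLE_TREE = Node(
    offset=0,
    pattern=b"GET ",
    match=Leaf("cluster-A"),
    no_match=Node(
        offset=1,
        pattern=b"BitTorrent",
        match=Leaf("cluster-B"),
        no_match=Leaf("cluster-other"),
    ),
)

if __name__ == "__main__":
    for payload in (b"GET /index.html", b"\x13BitTorrent protocol", b"\x00\x01"):
        print(payload[:12], "->", classify(payload, EXAMPLE_TREE))
```

Because each packet needs only one short byte comparison per tree level, a structure of this kind is cheap to evaluate per packet, which is in keeping with the abstract's emphasis on lightweight clustering.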