• DocumentCode
    843587
  • Title

    q-gram matching using tree models

  • Author

    Fogla, Prahlad ; Lee, Wenke

  • Author_Institution
    Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
  • Volume
    18
  • Issue
    4
  • fYear
    2006
  • fDate
    4/1/2006 12:00:00 AM
  • Firstpage
    433
  • Lastpage
    447
  • Abstract
    q-gram matching is used for approximate substring matching problems in a wide range of application areas, including intrusion detection. In this paper, we present a tree-based model to perform fast, linear-time q-gram matching. All q-grams present in the text are stored in a tree structure similar to a trie. We use a tree redundancy pruning algorithm to reduce the size of the tree without losing any information. We also use suffix links for fast q-gram search during query matching. We compare our work with the Rabin-Karp-based hash-table technique, which is commonly used for multiple q-gram search. We present results of experiments on system call sequence data used for intrusion detection.
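The idea of storing q-grams in a trie-like tree and checking a query against it can be sketched as follows. This is a minimal illustration, not the paper's implementation: it omits both the redundancy pruning and the suffix links mentioned in the abstract, so each lookup costs O(q) rather than amortized constant time. The baseline is a plain Python set of tuples standing in for a hash table, not a Rabin-Karp rolling hash. All function names and the sample system call trace are hypothetical.

```python
def build_qgram_trie(text, q):
    """Store every length-q window of `text` in a nested-dict trie."""
    root = {}
    for i in range(len(text) - q + 1):
        node = root
        for symbol in text[i:i + q]:
            # Create the child node on first sight, then descend into it.
            node = node.setdefault(symbol, {})
    return root


def trie_contains(root, gram):
    """Return True if `gram` is a root-to-leaf path in the trie."""
    node = root
    for symbol in gram:
        node = node.get(symbol)
        if node is None:
            return False
    return True


def mismatches_trie(root, query, q):
    """Start positions of query q-grams that are absent from the trie."""
    return [i for i in range(len(query) - q + 1)
            if not trie_contains(root, query[i:i + q])]


def mismatches_hash(text, query, q):
    """Same check using a set of tuples as a stand-in hash table."""
    known = {tuple(text[i:i + q]) for i in range(len(text) - q + 1)}
    return [i for i in range(len(query) - q + 1)
            if tuple(query[i:i + q]) not in known]


if __name__ == "__main__":
    # Hypothetical system call traces, for illustration only.
    train = ["open", "read", "read", "write", "close"]
    query = ["open", "read", "write", "close"]
    trie = build_qgram_trie(train, 3)
    print(mismatches_trie(trie, query, 3))
    print(mismatches_hash(train, query, 3))
```

In an intrusion detection setting, the positions returned would mark query windows never seen in the training trace. Both functions should agree on any input, since they test membership in the same set of q-grams.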
  • Keywords
    computational complexity; data mining; query processing; security of data; string matching; tree data structures; tree searching; Rabin-Karp-based hash-table technique; fast linear time q-gram matching; intrusion detection; multiple q-gram search; pattern matching; query matching; substring matching problem; suffix tree; tree data structure; tree redundancy pruning algorithm; tree-based model; trie structure; word processing; Computational biology; Computer Society; Detectors; Information retrieval; Intrusion detection; Pattern matching; Runtime; Sequences; Signal processing algorithms; Tree data structures; Intrusion detection; pattern matching; q-gram matching; search problems; string matching; suffix tree; tree data structure; trees; word processing
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2006.1599383
  • Filename
    1599383