• DocumentCode
    2447331
  • Title

    A configurable-hardware document-similarity classifier to detect web attacks

  • Author

    Ulmer, Craig ; Gokhale, Maya

  • Author_Institution
    Sandia Nat. Labs., Livermore, CA, USA
  • fYear
    2010
  • fDate
    19-23 April 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    This paper describes our approach to adapting a text document similarity classifier based on the Term Frequency Inverse Document Frequency (TFIDF) metric to reconfigurable hardware. The TFIDF classifier is used to detect web attacks in HTTP data. In our reconfigurable hardware approach, we design a streaming, real-time classifier by simplifying an existing sequential algorithm and manipulating the classifier's model to allow decision information to be represented compactly. We have developed a set of software tools to help automate the process of converting training data to synthesizable hardware and to provide a means of trading off between accuracy and resource utilization. The Xilinx Virtex 5-LX implementation requires two orders of magnitude less memory than the original algorithm. At 166 MB/s (80x the software throughput), the hardware implementation is able to achieve Gigabit network throughput at the same accuracy as the original algorithm.
  • Keywords
    Internet; field programmable gate arrays; pattern classification; security of data; text analysis; TFIDF metric; Web attack detection; Xilinx Virtex 5-LX; configurable-hardware document; document similarity classifier; reconfigurable hardware; term frequency inverse document frequency; Algorithm design and analysis; Frequency; Hardware; Laboratories; Network synthesis; Resource management; Software tools; Throughput; Training data; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on
  • Conference_Location
    Atlanta, GA
  • Print_ISBN
    978-1-4244-6533-0
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2010.5470737
  • Filename
    5470737