• DocumentCode
    731004
  • Title
    Rapid and parallel content screening for detecting transformed data exposure
  • Author
    Xiaokui Shu ; Jing Zhang ; Danfeng Yao ; Wu-Chun Feng
  • Author_Institution
    Dept. of Comput. Sci., Virginia Tech, Blacksburg, VA, USA
  • fYear
    2015
  • fDate
    April 26 2015-May 1 2015
  • Firstpage
    191
  • Lastpage
    196
  • Abstract
    The leak of sensitive data on computer systems poses a serious threat to organizational security. Organizations need to identify the exposure of sensitive data by screening the content in storage and transmission, i.e., to detect sensitive information being stored or transmitted in the clear. However, detecting the exposure of sensitive information is challenging due to data transformation in the content. Transformations (such as insertion and deletion) result in highly unpredictable leak patterns. Existing automata-based string matching algorithms are impractical for detecting transformed data leaks because of their formidable complexity when modeling the required regular expressions. We design two new algorithms for detecting long and inexact data leaks. Our system achieves high detection accuracy in recognizing transformed leaks compared with state-of-the-art inspection methods. We parallelize our prototype on a graphics processing unit and demonstrate the strong scalability of our data leak detection solution when analyzing big data.
  • Keywords
    Big Data; security of data; Big Data analysis; automata-based string matching algorithms; data leak detection solution; graphics processing unit; organizational security; sensitive data; Accuracy; Algorithm design and analysis; Graphics processing units; Heuristic algorithms; Leak detection; Security; Sensitivity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Communications Workshops (INFOCOM WKSHPS), 2015 IEEE Conference on
  • Conference_Location
    Hong Kong
  • Type
    conf
  • DOI
    10.1109/INFCOMW.2015.7179383
  • Filename
    7179383