DocumentCode
1827977
Title
PACE: Pattern Accurate Computationally Efficient Bootstrapping for Timely Discovery of Cyber-security Concepts
Author
McNeil, Nikki ; Bridges, Robert A. ; Iannacone, Michael D. ; Czejdo, Bogdan ; Perez, Noel ; Goodall, John R.
Author_Institution
Dept. of Math., Univ. of Maryland, Baltimore, MD, USA
Volume
2
fYear
2013
fDate
4-7 Dec. 2013
Firstpage
60
Lastpage
65
Abstract
Public disclosure of important security information, such as knowledge of vulnerabilities or exploits, often occurs in blogs, tweets, mailing lists, and other online sources well before it is properly classified into structured databases. To facilitate timely discovery of such knowledge, we propose a novel semi-supervised learning algorithm, PACE, for identifying and classifying relevant entities in text sources. The main contribution of this paper is an enhancement of the traditional bootstrapping method for entity extraction: a time-memory trade-off that circumvents a costly corpus search while strengthening pattern nomination, which should increase accuracy. We discuss an implementation in the cyber-security domain, as well as the challenges that the security domain poses for Natural Language Processing.
Keywords
Internet; database management systems; learning (artificial intelligence); security of data; social networking (online); PACE; blogs; cyber security concepts; cyber-security domain; information security; mailing lists; natural language processing; online sources; pattern accurate computationally efficient bootstrapping; pattern nomination; semisupervised learning algorithm; structured databases; tweets; Blogs; Computer security; Context; Databases; Pattern matching; Training data; Bootstrapping; Cyber-Security; Entity Extraction; Natural Language Processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications (ICMLA), 2013 12th International Conference on
Conference_Location
Miami, FL
Type
conf
DOI
10.1109/ICMLA.2013.106
Filename
6786082