Developing understanding of hacker language through the use of lexical semantics

Author

Benjamin, Victor; Chen, Hsinchun

Author_Institution

Department of Management Information Systems, University of Arizona, Tucson, AZ, USA

fYear

2015

fDate

27-29 May 2015

Firstpage

79

Lastpage

84

Abstract

Recent years have seen frequent calls for more research scrutinizing online hacker communities. However, researchers and practitioners face many challenges when attempting such work. In particular, they may encounter hacking-specific terms, concepts, tools, and other items that are unfamiliar and difficult to understand. We are therefore motivated to develop an automated method for building an understanding of hacker language. We use recent advances in recurrent neural network language models (RNNLMs) to develop an unsupervised machine learning technique for learning hacker language. The selected RNNLM produces state-of-the-art word embeddings that are useful for understanding the relations between different hacker terms and concepts. We evaluate our work by testing the RNNLM's ability to learn relevant relations between known hacker terms. Results suggest that recent work in RNNLMs can aid in modeling hacker language, providing a promising direction for future research.
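
As a rough illustration of how word embeddings can expose relations between terms, the sketch below ranks a small vocabulary by cosine similarity to a query term. The vectors, the example terms, and the helper names `cosine` and `nearest_terms` are assumptions made up for this illustration. They are not taken from the paper, where the embeddings are learned by an RNNLM rather than written by hand.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# Learned embeddings would typically have far more dimensions.
TOY_EMBEDDINGS = {
    "botnet": [0.9, 0.1, 0.0],
    "zombie": [0.8, 0.2, 0.1],
    "keylogger": [0.1, 0.9, 0.2],
    "carding": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_terms(query, embeddings, k=2):
    """Return the k terms most similar to `query`, best match first."""
    query_vec = embeddings[query]
    scored = [
        (term, cosine(query_vec, vec))
        for term, vec in embeddings.items()
        if term != query  # skip the query term itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for term, score in nearest_terms("botnet", TOY_EMBEDDINGS):
        print(f"{term}\t{score:.3f}")
```

With these toy vectors, "zombie" ranks closest to "botnet" because the two vectors point in nearly the same direction. An analyst could run the same kind of nearest-neighbor query against learned embeddings to look up unfamiliar terms.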

Keywords

Internet; computer crime; recurrent neural nets; unsupervised learning; RNNLM; lexical semantics; online hacker language; recurrent neural network language model; unsupervised machine learning technique; Approximation methods; Biological system modeling; Communities; Computer hacking; Context; Semantics; Cybersecurity; Hacker community; Language model; Recurrent neural network

fLanguage

English

Publisher

IEEE

Conference_Titel

2015 IEEE International Conference on Intelligence and Security Informatics (ISI)

Conference_Location

Baltimore, MD

Print_ISBN

978-1-4799-9888-3

Type

conf

DOI

10.1109/ISI.2015.7165943

Filename

7165943