DocumentCode
571480
Title
Detecting Malicious Websites by Learning IP Address Features
Author
Chiba, Daiki ; Tobe, Kazuhiro ; Mori, Tatsuya ; Goto, Shigeki
Author_Institution
Dept. of Comput. Sci. & Eng., Waseda Univ., Tokyo, Japan
fYear
2012
fDate
16-20 July 2012
Firstpage
29
Lastpage
39
Abstract
Web-based malware attacks have become one of the most serious threats and need to be addressed urgently. Several approaches that have attracted attention as promising ways of detecting such malware employ various blacklists. However, these conventional approaches often fail to detect new attacks owing to the versatility of malicious websites, which makes it difficult to keep blacklists up to date with information on new malicious websites. To tackle this problem, we propose a new method for detecting malicious websites using the characteristics of IP addresses. Our approach leverages the empirical observation that IP addresses are more stable than other metrics such as URLs and DNS names. While the strings that form URLs or domain names are highly variable, IP addresses are less variable, i.e., the IPv4 address space is mapped onto 4-byte strings. We develop a lightweight and scalable detection scheme based on a machine learning technique. The aim of this study is not to provide a single solution that effectively detects web-based malware but to develop a technique that compensates for the drawbacks of existing approaches. We validate the effectiveness of our approach using real IP address data from existing blacklists and real traffic data from a campus network. The results demonstrate that our method can expand the coverage and accuracy of existing blacklists and also detect unknown malicious websites that are not covered by conventional approaches.
Keywords
Web sites; invasive software; learning (artificial intelligence); 4-bytes strings; DNS; IP address features; IPv4 address space; URL; Web-based malware attacks; blacklists; campus network; machine learning technique; malicious Websites detection; traffic data; Browsers; Feature extraction; IP networks; Malware; Support vector machines; Training; Vectors; Blacklist; Drive-by-download; IP address; Machine learning; Web-based malware;
fLanguage
English
Publisher
IEEE
Conference_Titel
Applications and the Internet (SAINT), 2012 IEEE/IPSJ 12th International Symposium on
Conference_Location
Izmir
Print_ISBN
978-1-4673-2001-6
Electronic_ISBN
978-0-7695-4737-4
Type
conf
DOI
10.1109/SAINT.2012.14
Filename
6305258