DocumentCode :
1864758
Title :
Using Domain Top-page Similarity Feature in Machine Learning-Based Web Phishing Detection
Author :
Sanglerdsinlapachai, Nuttapong ; Rungsawang, Arnon
Author_Institution :
Thai Comput. Emergency Response Team, Nat. Electron. & Comput. Technol. Center, Pathumthani, Thailand
fYear :
2010
fDate :
9-10 Jan. 2010
Firstpage :
187
Lastpage :
190
Abstract :
This paper presents a study on using a concept feature to detect web phishing. Building on the features introduced in the Carnegie Mellon Anti-phishing and Network Analysis Tool (CANTINA), we add a domain top-page similarity feature to a machine learning-based phishing detection system. In a preliminary experiment on a small set of 200 web pages, consisting of 100 phishing pages and 100 non-phishing pages, the system achieved an f-measure of up to 0.9250 with an error rate of 7.50%.
Keywords :
Internet; learning (artificial intelligence); security of data; CANTINA; Carnegie Mellon Anti-phishing and Network Analysis Tool; Web phishing detection; Web phishing problem; concept feature; domain top-page similarity feature; machine learning; phishing detection system; Computer crime; Computer vision; Data mining; Error analysis; Knowledge engineering; Machine learning; Neural networks; Support vector machines; Uniform resource locators; Web pages; anti-phishing; domain top-page; machine learning; phishing; semantic similarity;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Knowledge Discovery and Data Mining, 2010. WKDD '10. Third International Conference on
Conference_Location :
Phuket
Print_ISBN :
978-1-4244-5397-9
Electronic_ISBN :
978-1-4244-5398-6
Type :
conf
DOI :
10.1109/WKDD.2010.108
Filename :
5432672