Title :
Approximate string matching algorithm for phishing detection
Author :
Abraham, Dona ; Raj, Nisha S.
Author_Institution :
Dept. of Comput. Sci. & Eng., SCMS Sch. of Eng. & Technol., Ernakulam, India
Abstract :
Phishing is the act of stealing personal and sensitive user information over the Internet and using it for financial transactions. The goal of phishers is to carry out fraudulent transactions on behalf of their victims using the information stolen from them. Such attacks have made using Internet services risky for ordinary users. Many methods have been developed to fight phishing attacks, but as attackers adopt more sophisticated techniques, each method fails to perform well in detecting them. Here we propose a string matching method for detecting phishing attacks that determines the degree of similarity between a URL and blacklisted URLs. A URL can thus be classified as phishing or non-phishing based on its textual properties. Two string matching algorithms, Longest Common Subsequence (LCS) and Edit Distance, are used in the hostname comparison. The accuracy obtained is 99.1% for LCS and 99.5% for Edit Distance.
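The hostname comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the normalization by the longer hostname's length, the 0.8 threshold, and the function names (`lcs_similarity`, `edit_similarity`, `looks_like_phishing`) are assumptions made for illustration only, since the abstract does not state how the similarity score is turned into a decision.

```python
from urllib.parse import urlparse


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def hostname(url: str) -> str:
    """Lower-cased hostname of a URL, or '' if the URL has none."""
    return (urlparse(url).hostname or "").lower()


def lcs_similarity(a: str, b: str) -> float:
    # Assumption: normalize by the longer string so the score lies in [0, 1].
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 1.0


def edit_similarity(a: str, b: str) -> float:
    # Assumption: convert distance to similarity using the longer string's length.
    longest = max(len(a), len(b))
    return 1.0 - edit_distance(a, b) / longest if longest else 1.0


def looks_like_phishing(url, blacklist, similarity=edit_similarity, threshold=0.8):
    """Flag a URL whose hostname is close to any blacklisted hostname.

    The threshold of 0.8 is a hypothetical value, not taken from the paper.
    """
    host = hostname(url)
    if not host:
        return False
    return any(similarity(host, hostname(entry)) >= threshold for entry in blacklist)
```

Calling `looks_like_phishing(url, blacklist)` uses the edit-distance score by default; passing `similarity=lcs_similarity` switches to the LCS score, so the two measures can be compared on the same blacklist.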
Keywords :
Internet; computer crime; string matching; text analysis; unsolicited e-mail; LCS; URL textual properties; approximate string matching algorithm; edit distance; fraudulent transactions; longest common subsequence; personal information stealing; phishing attack detection; sensitive user information stealing; Accuracy; Electronic mail; Feature extraction; IP networks; Training; Uniform resource locators; Approximate String matching; Blacklist; Edit Distance; Longest Common Subsequence (LCS); Phishing Attacks;
Conference_Title :
Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-1-4799-3078-4
DOI :
10.1109/ICACCI.2014.6968578