Author_Institution :
Sch. of Comput. Sci., Southwest Pet. Univ., Chengdu, China
Abstract :
Some criminals and malcontents attempt to take advantage of others by using malicious Web sites. As a result, many systems have been developed to prevent end users from visiting such sites. These systems use a variety of approaches; for example, blacklists are constructed through a range of techniques including manual reporting, honey pots, and Web crawlers. Inevitably, not all malicious sites are blacklisted. To address this problem, some client-side systems have been developed to analyze the content or behavior of a Web site as it is visited, but their run-time overhead cannot be avoided. Compared with these approaches, a more efficient approach to detecting malicious Web sites exists. Although this approach achieves 95-99% accuracy, it requires private information. In this paper, a new strategy for detecting malicious Web sites based on privacy preservation is proposed. We use structural partition and the Singular Value Decomposition (SVD) technique to protect the private information, and then use a Support Vector Machine (SVM) for evaluation. Our experimental results indicate that, compared with the original method, the new strategy achieves similar accuracy in detecting large numbers of malicious Web sites from their URLs.
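The abstract does not spell out how structural partition and SVD are combined, so the following is only a minimal sketch under stated assumptions: the feature matrix is split into column blocks, and each block is replaced by its rank-k SVD approximation before being handed to a classifier. The function names, the block count, the rank `k`, and the random feature matrix are all hypothetical and are not taken from the paper.

```python
import numpy as np


def svd_perturb(block, k):
    """Return the rank-k SVD approximation of a 2-D feature block."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    k = min(k, len(s))  # never ask for more components than exist
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def partition_and_perturb(features, n_blocks, k):
    """Split the columns into n_blocks groups and perturb each group separately.

    Assumption: "structural partition" is modeled here as a simple
    column-wise split; the paper may define it differently.
    """
    blocks = np.array_split(features, n_blocks, axis=1)
    return np.hstack([svd_perturb(b, k) for b in blocks])


# Hypothetical URL feature matrix: 20 URLs (rows) by 8 numeric features.
rng = np.random.default_rng(0)
X = rng.random((20, 8))

# Perturbed matrix with the same shape as X but distorted values.
X_private = partition_and_perturb(X, n_blocks=2, k=2)
```

In a pipeline like the one the abstract describes, `X_private` rather than `X` would be passed to the SVM, so the classifier never sees the original feature values. Smaller `k` distorts the data more, which presumably trades accuracy for privacy.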
Keywords :
Web sites; data privacy; singular value decomposition; support vector machines; SVD; SVM; Web crawlers; client-side systems; honey pots; malicious Web site detection; manual reporting; privacy preservation; singular value decomposition technique; structural partition technique; support vector machine; suspicious URL; Accuracy; Data privacy; Internet; Privacy; Singular value decomposition; Support vector machines; Web sites; Classification; Privacy Protection; SVD