DocumentCode :
686401
Title :
Multi-level web content filter model based on MapReduce
Author :
Bin Wu ; Xing Lin ; Dongmei Zhang
Author_Institution :
Inf. Security Center, Beijing Univ. of Posts & Telecommun., Beijing, China
fYear :
2013
fDate :
22-24 Nov. 2013
Firstpage :
1
Lastpage :
6
Abstract :
Traditional web filtering systems cannot filter sensitive web pages effectively in real time. To address this problem, a multi-level web content filtering model based on MapReduce is proposed. The model employs three filtering strategies: blocking of IP addresses and URLs, keyword filtering, and intelligent filtering. In the intelligent filtering mechanism, an improved Knn algorithm based on maximum category space classifies web pages and filters out the sensitive ones. To reduce filtering time, a parallelization framework built on the MapReduce distributed computing model is proposed. The framework carries out the large-scale computation of feature vectors and Euclidean distances in parallel, which considerably improves filtering efficiency. Experimental results show that the distributed filtering model can filter sensitive web pages effectively in real time, and that a higher filtering rate can be achieved with high accuracy by increasing the number of distributed nodes.
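The three filtering levels and the parallel distance computation described in the abstract can be illustrated with a minimal sketch. It assumes a plain k-nearest-neighbor majority vote, not the paper's improved Knn algorithm based on maximum category space, whose details the abstract does not give. The MapReduce steps are simulated in a single process, and every name here (map_partition, merge_candidates, knn_classify, filter_page, the "sensitive" label) is hypothetical.

```python
import math
from collections import Counter
from functools import reduce
from heapq import nsmallest


def map_partition(partition, query, k):
    """Map step: the k nearest (distance, label) pairs within one partition."""
    scored = ((math.dist(vec, query), label) for vec, label in partition)
    return nsmallest(k, scored)


def merge_candidates(left, right, k):
    """Reduce step: keep the k nearest candidates across two partial results."""
    return nsmallest(k, left + right)


def knn_classify(partitions, query, k=3):
    """Plain Knn majority vote (assumption: not the paper's improved variant)."""
    per_partition = [map_partition(p, query, k) for p in partitions]
    nearest = reduce(lambda a, b: merge_candidates(a, b, k), per_partition, [])
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def filter_page(ip, url, text, query_vector, partitions,
                blocked_ips, blocked_urls, keywords, k=3):
    """Apply the three levels in order, stopping at the first match."""
    # Level 1: IP address and URL blocking.
    if ip in blocked_ips or url in blocked_urls:
        return "blocked: address"
    # Level 2: keyword filtering on the page text.
    lowered = text.lower()
    if any(word in lowered for word in keywords):
        return "blocked: keyword"
    # Level 3: classification of the page's feature vector.
    if knn_classify(partitions, query_vector, k) == "sensitive":
        return "blocked: classifier"
    return "allowed"
```

In a real deployment each partition of the training sample set would live on a separate node, so the map step would run concurrently and only the k best candidates per node would reach the reducer.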
Keywords :
Internet; information filtering; parallel processing; Euclidean distance; IP address; MapReduce; URL; distributed computing; distributed filtering model; feature vector; improved Knn algorithm; intelligent filtering mechanism; keyword filtering; maximum category space; multilevel Web content filtering model; parallelization framework; sensitive Web pages; Web page filtering; feature item vector; intelligent content filtering; multi-level; parallel process; training sample set; web classification
fLanguage :
English
Publisher :
IET
Conference_Titel :
Information and Network Security (ICINS 2013), 2013 International Conference on
Conference_Location :
Beijing
Electronic_ISBN :
978-1-84919-729-8
Type :
conf
DOI :
10.1049/cp.2013.2468
Filename :
6826017