• DocumentCode
    686401
  • Title
    Multi-level web content filter model based on MapReduce
  • Author
    Bin Wu ; Xing Lin ; Dongmei Zhang
  • Author_Institution
    Inf. Security Center, Beijing Univ. of Posts & Telecommun., Beijing, China
  • fYear
    2013
  • fDate
    22-24 Nov. 2013
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    To address the inability of traditional web filtering systems to filter sensitive web pages effectively in real time, a multi-level web content filtering model based on MapReduce is proposed. The model employs three filtering strategies: blocking of IP addresses and URLs, keyword filtering, and intelligent filtering. In the intelligent filtering mechanism, an improved KNN algorithm based on maximum category space is adopted to classify web pages and filter out the sensitive ones. To reduce the filtering time, we propose a parallelization framework based on the MapReduce distributed computing model. The framework carries out the computationally heavy feature vector and Euclidean distance calculations in parallel, which substantially improves filtering efficiency. Experimental results show that the distributed filtering model can filter sensitive web pages effectively in real time, and that a higher filtering rate with high accuracy can be achieved by increasing the number of distributed nodes.
  • Keywords
    Internet; information filtering; parallel processing; Euclidean distance; IP address; MapReduce; URL; distributed computing; distributed filtering model; feature vector; improved Knn algorithm; intelligent filtering mechanism; keyword filtering; maximum category space; multilevel Web content filtering model; parallelization framework; sensitive Web pages; MapReduce; Web page filtering; distributed computing; feature item vector; improved Knn algorithm; intelligent content filtering; multi-level; parallel process; training sample set; web classification;
  • fLanguage
    English
  • Publisher
    IET
  • Conference_Titel
    Information and Network Security (ICINS 2013), 2013 International Conference on
  • Conference_Location
    Beijing
  • Electronic_ISBN
    978-1-84919-729-8
  • Type
    conf
  • DOI
    10.1049/cp.2013.2468
  • Filename
    6826017
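
The three filtering levels and the parallel distance computation described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: it uses plain KNN with majority voting rather than the improved variant based on maximum category space (which the abstract does not specify), and it simulates the map and reduce phases with ordinary function calls instead of a real MapReduce cluster. All names (`map_partition`, `reduce_topk`, `knn_classify`, `multi_level_filter`, the `page` dictionary keys, and the `vectorize` callback) are hypothetical and chosen only for illustration.

```python
import heapq
import math
from collections import Counter
from functools import reduce


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def map_partition(partition, query, k):
    """Map step (assumed design): for one partition of the training set,
    return the k nearest (distance, label) pairs to the query vector."""
    pairs = ((euclidean(vec, query), label) for vec, label in partition)
    return heapq.nsmallest(k, pairs)


def reduce_topk(left, right, k):
    """Reduce step: merge two local top-k lists into one top-k list."""
    return heapq.nsmallest(k, left + right)


def knn_classify(partitions, query, k=3):
    """Plain KNN by majority vote over the globally nearest k samples.
    Each map_partition call is independent, so on a cluster the calls
    could run on separate nodes; here they run sequentially."""
    local = [map_partition(p, query, k) for p in partitions]
    nearest = reduce(lambda a, b: reduce_topk(a, b, k), local, [])
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def multi_level_filter(page, blocked, keywords, partitions, vectorize, k=3):
    """Apply the three levels in order of increasing cost."""
    # Level 1: IP address and URL blocking.
    if page["ip"] in blocked or page["url"] in blocked:
        return "blocked:address"
    # Level 2: keyword filtering (simple case-insensitive substring match).
    text = page["text"].lower()
    if any(kw in text for kw in keywords):
        return "blocked:keyword"
    # Level 3: classifier-based filtering on the page's feature vector.
    if knn_classify(partitions, vectorize(page["text"]), k) == "sensitive":
        return "blocked:classifier"
    return "allowed"
```

Ordering the levels from cheapest to most expensive means the distance computation only runs for pages that pass the address and keyword checks. Because each partition contributes at most k candidates, the reduce step merges small lists regardless of how large the training set is.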