DocumentCode :
3318321
Title :
Web sensitive text filtering by combining semantics and statistics
Author :
Wu, Ou ; Hu, Weiming
Author_Institution :
Nat. Lab. of Pattern Recognition, Chinese Acad. of Sci., Beijing, China
fYear :
2005
fDate :
30 Oct.-1 Nov. 2005
Firstpage :
663
Lastpage :
667
Abstract :
Web sensitive information is defined as texts, pictures and other forms of information on the Web that contain erotic content. How to filter this harmful information has attracted researchers' interest. To keep Web content safe, governments have also given great support to research on this problem. This paper first briefly reviews recent developments in Web sensitive information filtering; the statistical and semantic features of sensitive texts are then analyzed and represented by a CNN-like word net. Finally, a novel method that combines semantics and statistics is proposed to filter sensitive text on the Web. Experimental results demonstrate the promising performance of the proposed method.
Keywords :
Internet; information filtering; statistical analysis; text analysis; CNN-like word net; Web sensitive information filtering; erotic Web content; Automation; Government; Information filtering; Information filters; Internet; Laboratories; Pattern recognition; Statistics; Uniform resource locators; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
Type :
conf
DOI :
10.1109/NLPKE.2005.1598819
Filename :
1598819