Title :
A Utility-Based Web Content Sensitivity Mining Approach
Author :
Wang, Cheng ; Liu, Ying ; Jian, Liheng ; Zhang, Peng
Author_Institution :
Agilent Technologies Co. Ltd., Beijing
Abstract :
Abnormal remarks on the World Wide Web, such as violence, threats, and superstition, may disturb social order and public morality. Most traditional methods filter a page whenever it contains a keyword from a predefined blacklist. Such methods cannot provide a quantitative measure of how sensitive the content is. In this paper, we propose a utility-based Web content sensitivity mining approach, in which utility is the measure of how sensitive a page is. This allows Internet regulators to take different actions according to different sensitivity values. We apply our approach to a real-world Web dataset, where it identified a number of sensitive Web pages that traditional frequency-based methods failed to find. By varying the sensitivity values of the keywords, different sets of high-sensitivity keywords were discovered.
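The contrast between blacklist filtering and a utility score can be illustrated with a minimal sketch. The abstract does not give the paper's actual utility definition, so this assumes the common utility-mining form in which a page's utility is the sum of each keyword's occurrence count multiplied by a per-keyword sensitivity weight. The keyword weights, the thresholds, and the names `SENSITIVITY`, `page_utility`, `blacklist_hit`, and `action` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical per-keyword sensitivity weights; not taken from the paper.
SENSITIVITY = {"threat": 5.0, "violence": 4.0, "superstition": 1.5}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def page_utility(text, weights=SENSITIVITY):
    """Assumed utility: sum of (occurrence count x sensitivity weight)."""
    counts = Counter(tokenize(text))
    return sum(counts[word] * weight for word, weight in weights.items())


def blacklist_hit(text, weights=SENSITIVITY):
    """Traditional filter: True as soon as any listed keyword appears."""
    return any(token in weights for token in tokenize(text))


def action(utility, review_at=4.0, block_at=10.0):
    """Map a utility score to a graded action (thresholds are assumptions)."""
    if utility >= block_at:
        return "block"
    if utility >= review_at:
        return "review"
    return "allow"


if __name__ == "__main__":
    for page in ["A threat of violence, a real threat.", "An old superstition."]:
        score = page_utility(page)
        print(page, blacklist_hit(page), score, action(score))
```

Under these assumed weights, both sample pages trigger the blacklist, but only the first scores high enough to be blocked. Changing the values in `SENSITIVITY` changes which keywords dominate the score, which mirrors the abstract's remark about varying the sensitivity values.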
Keywords :
Internet; content management; data mining; utility theory; Internet regulator; sensitive value; utility-based Web content sensitivity mining approach; Data security; Databases; Frequency; Information filters; Intelligent agent; Itemsets; Monitoring; Regulators; Web pages; Web audit; Web content mining; public opinion monitoring; utility mining
Conference_Title :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
DOI :
10.1109/WIIAT.2008.203