DocumentCode
2036294
Title
Semi-automated feature selection for web text filtering
Author
Chen, Ying ; Wu, Ou
Author_Institution
Beijing Electron. Sci. & Technol. Inst., Beijing, China
Volume
6
fYear
2010
fDate
10-12 Aug. 2010
Firstpage
2513
Lastpage
2517
Abstract
The explosive growth of the Internet inevitably leads to the proliferation of harmful information such as pornography, drugs, and violence. Many filtering techniques based on image and text categorization have been proposed in the literature. Among them, text-based filtering plays a leading role because of its good performance. Existing text filtering algorithms can be viewed as a classical text categorization approach that distinguishes two topics, i.e., harmful and benign. In this paper, motivated by the linguistic characteristics of text features and by related text classification tasks such as genre detection, a new feature selection framework for text filtering is proposed. It combines linguistics and domain knowledge in an effective way. Experimental results demonstrate that our method is better suited to domain-specific text filtering tasks.
Keywords
Internet; classification; feature extraction; image processing; information filtering; text analysis; Internet; Web text filtering; domain knowledge; genre detection; image categorization; linguistics; semiautomated feature selection; text categorization; text classification; text feature; Conferences; Construction industry; Feature extraction; Filtering; Tagging; Text categorization; Training; Web filtering; feature selection; semi-automated;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location
Yantai, Shandong
Print_ISBN
978-1-4244-5931-5
Type
conf
DOI
10.1109/FSKD.2010.5569606
Filename
5569606