• DocumentCode
    2036294
  • Title

    Semi-automated feature selection for web text filtering

  • Author

    Chen, Ying ; Wu, Ou

  • Author_Institution
    Beijing Electron. Sci. & Technol. Inst., Beijing, China
  • Volume
    6
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    2513
  • Lastpage
    2517
  • Abstract
    The explosive growth of the Internet inevitably leads to the proliferation of harmful information such as pornography, drugs, and violence. Many filtering techniques based on image and text categorization have been proposed in the literature. Among them, text-based filtering plays a leading role because of its good performance. Existing text filtering algorithms can be seen as a classical text categorization approach that discerns two topics, i.e., harmful and benign. In this paper, motivated by the linguistic character of text features and by related text classification tasks such as genre detection, a new feature selection framework for text filtering is proposed. It combines linguistics and domain knowledge in an effective way. Experimental results demonstrate that our method is better adapted to domain-specific text filtering tasks.
  • Keywords
    Internet; classification; feature extraction; image processing; information filtering; text analysis; Web text filtering; domain knowledge; genre detection; image categorization; linguistics; semiautomated feature selection; text categorization; text classification; text feature; Conferences; Construction industry; Filtering; Tagging; Training; Web filtering; feature selection; semi-automated
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5931-5
  • Type

    conf

  • DOI
    10.1109/FSKD.2010.5569606
  • Filename
    5569606