• DocumentCode
    2563450
  • Title

    Research on Algorithm of Web Classification Based on EP and FFSS

  • Author

    Wang, WeiPing ; Wang, Zufeng

  • fYear
    2007
  • fDate
    15-19 Dec. 2007
  • Firstpage
    162
  • Lastpage
    166
  • Abstract
    In this paper, we present a new web classification algorithm that combines extended pages (EP) with fair feature-subset selection (FFSS). Because hyperlinks are important, we extend web pages with anchor text. In extended pages, the proportion of useful features increases, so we can improve web classification. To exploit the structure of the web, we obtain extended pages by appending the sentence or paragraph that contains the anchor text to the original pages. Fair feature-subset selection not only treats each category fairly but can also identify useful features, both positive and negative, so it can address the high dimensionality of the vector space. Experiments show that the new algorithm improves on the precision and recall of the traditional method.
  • Keywords
    Classification algorithms; Computational intelligence; Feature extraction; Information management; Information security; Pattern recognition; Space technology; Support vector machine classification; Support vector machines; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Security, 2007 International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    0-7695-3072-9
  • Electronic_ISBN
    978-0-7695-3072-7
  • Type

    conf

  • DOI
    10.1109/CIS.2007.152
  • Filename
    4415323