• DocumentCode
    3286241
  • Title

    Web Page Categorization Based on k-NN and SVM Hybrid Pattern Recognition Algorithm

  • Author

    Shi, Xuelin ; Zhao, Ying ; Dong, Xiangjun

  • Author_Institution
    Sch. of Inf. Sci. & Technol., Beijing Univ. of Chem. Technol., Beijing
  • Volume
    2
  • fYear
    2008
  • fDate
    18-20 Oct. 2008
  • Firstpage
    523
  • Lastpage
    527
  • Abstract
    Traditional information retrieval (IR) methods use keyword matching to filter documents, but they usually retrieve unrelated Web pages. To classify Web pages effectively, we present a Web page categorization algorithm named WebPSC (Web page similarity categorization). The algorithm uses latent semantic indexing (LSI) to model Web pages and performs categorization with a hybrid pattern recognition algorithm that combines k-NN and a support vector machine (SVM). As an implementation of WebPSC, we present an intelligent agent system that acquires user interests and helps users retrieve Web pages. Empirical results indicate that the method can reach high levels of accuracy in Web page classification.
  • Keywords
    Web sites; information retrieval; pattern recognition; support vector machines; SVM; Web page similarity categorization; information retrieval; intelligent agent system; k-NN; k-nearest neighbor; latent semantic indexing; pattern recognition algorithm; support vector machine; Indexing; Information filtering; Information filters; Information retrieval; Large scale integration; Matched filters; Pattern recognition; Support vector machine classification; Support vector machines; Web pages; Categorization; Latent Semantic Indexing; Singular Value Decomposition; Support Vector Machine; k-NN;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
  • Conference_Location
    Shandong
  • Print_ISBN
    978-0-7695-3305-6
  • Type

    conf

  • DOI
    10.1109/FSKD.2008.574
  • Filename
    4666172