• DocumentCode
    2721670
  • Title
    Machine Learning Techniques for Automated Web Page Classification Using URL Features
  • Author
    Devi, M. Indra; Rajaram, Dr. R.; Selvakuberan, K.
  • Author_Institution
    Thiagarajar College of Engineering, Madurai
  • Volume
    2
  • fYear
    2007
  • fDate
    13-15 Dec. 2007
  • Firstpage
    116
  • Lastpage
    120
  • Abstract
    The explosive growth of the Internet makes it difficult for search engines to return relevant results to users within a stipulated time. Search engines store Web pages in classified directories; although some search engines depend on human expertise for this process, most use automated methods to classify Web pages. In this paper we use machine-learning techniques for the automated classification of Web pages. We consider only URL features for classification, because a URL is unique and meaningful and, most of the time, helps identify the subject category of its page. Experimental results show that machine-learning techniques for the automated classification of Web pages using URL features prove to be the best and most useful method for search engines.
  • Keywords
    Internet; learning (artificial intelligence); pattern classification; search engines; URL; automated Web page classification; machine learning; Educational institutions; Humans; Machine learning algorithms; Support vector machine classification; Support vector machines; Uniform resource locators; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    International Conference on Computational Intelligence and Multimedia Applications, 2007
  • Conference_Location
    Sivakasi, Tamil Nadu
  • Print_ISBN
    0-7695-3050-8
  • Type
    conf
  • DOI
    10.1109/ICCIMA.2007.342
  • Filename
    4426680