• DocumentCode
    2768493
  • Title

    Web Host Access Tool: A Support Vector Machine Approach

  • Author

    Banerjee, Satarupa ; Cassel, Lillian

  • Author_Institution
    Villanova Univ., Villanova
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    1176
  • Lastpage
    1183
  • Abstract
    Search engines are an integral part of the web information gathering and retrieval process, but they depend heavily on the words or phrases entered by the end user. The search engine usually contributes no additional semantic data to the information acquisition process. This paper presents an intelligent search agent, the "Web host access tool" (WHAT), based on support vector machines (SVM), which introduces the notion of queries conducted within a specific contextual meaning. Given a context and associated keywords that reflect the search history and preferences of the user, WHAT performs more intelligent resource filtering than conventional search engines, providing more relevant results while filtering out irrelevant references. Search results obtained as text from different search engines are processed by an SVM-based word classifier that arranges them according to user preferences learned from previous searches. The text is processed by Latent Semantic Indexing (LSI) to create a document matrix that gives the probability of a word occurring in a specified context. For simplicity, this paper considers five contexts: business, education, entertainment, news and information, and tourism. The LSI coefficients are used by the SVM to yield a confidence level for each search result, and the results are sorted by that confidence level and presented to the end user. As an alternative, least squares support vector machines (LS-SVM) are also studied. The system is updated when the user provides feedback on search relevance, from which the LSI coefficients are updated to increase relevance in future searches. Three kernel functions are considered: linear, radial basis function (RBF), and polynomial. The results indicate that the linear SVM performs better than the others, in terms of both classification accuracy and training speed.
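For reference, the three kernel families named in the abstract can be written in their standard textbook forms. This is a minimal sketch, not the paper's implementation: the function names and the parameters `gamma`, `degree`, and `coef0` (and their default values) are illustrative assumptions, since the abstract does not state which settings were used.

```python
import math

# Standard textbook kernel definitions. Parameter names and defaults
# (gamma, degree, coef0) are assumptions for illustration only; the
# abstract does not report the values used in the paper.

def linear_kernel(x, y):
    """Linear kernel: the plain dot product of the two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (dot product + coef0) raised to `degree`."""
    return (linear_kernel(x, y) + coef0) ** degree
```

In a setup like the one the abstract describes, `x` and `y` would be LSI coefficient vectors for two search results; the linear kernel has no kernel parameters to tune, which is one plausible reason it can train faster than the other two.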
  • Keywords
    document handling; search engines; support vector machines; Web host access tool; Web information gathering; Web information retrieval process; document matrix; intelligent resource filtering; intelligent search agent; latent semantic indexing; least square support vector machine; linear radial basis function; polynomial kernel; search engines; Filtering; History; Information retrieval; Intelligent agent; Kernel; Large scale integration; Machine intelligence; Search engines; Support vector machine classification; Support vector machines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2006. IJCNN '06. International Joint Conference on
  • Conference_Location
    Vancouver, BC
  • Print_ISBN
    0-7803-9490-9
  • Type

    conf

  • DOI
    10.1109/IJCNN.2006.246824
  • Filename
    1716235