• DocumentCode
    3399089
  • Title
    An approach for measuring semantic similarity between words using SVM and LS-SVM
  • Author
    Lavanya, S. ; Arya, S.S.
  • Author_Institution
    Dept. of CSE, Sona Coll. of Technol., Salem, India
  • fYear
    2012
  • fDate
    10-12 Jan. 2012
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Measuring semantic similarity between words plays a vital role in information retrieval and natural language processing. The existing system uses page counts and snippets retrieved by a search engine to measure semantic similarity between words. Various similarity scores are calculated from the page counts that the search engine returns for the conjunctive query of the two words. A lexical pattern extraction algorithm identifies patterns in the snippets. Different patterns that express the same semantic relation are clustered using a lexical pattern clustering algorithm. The existing system uses Support Vector Machines (SVM) to combine the similarity scores from page counts with the clusters of patterns from snippets in order to measure similarity. We propose a different machine learning approach, the Latent Structural Support Vector Machine (LS-SVM), which can handle the missing data values that occur frequently in statistical data analysis. The proposed system also compares the similarity results obtained from SVM and LS-SVM.
  • Keywords
    data analysis; learning (artificial intelligence); natural language processing; pattern clustering; query processing; search engines; semantic Web; statistical analysis; support vector machines; LS-SVM; information retrieval; latent structural support vector machine; lexical pattern clustering algorithm; lexical pattern extraction algorithm; machine learning approach; missing data value handling; natural language processing; page counts; page snippets; queried conjunctive words; search engine; semantic relation; semantic similarity measurement; similarity scores; statistical data analysis; Computers; Machine learning; Search engines; Semantics; Support vector machines; Training; Web search; Latent Support Vector Machine; Support Vector Machine; Web Mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Communication and Informatics (ICCCI), 2012 International Conference on
  • Conference_Location
    Coimbatore
  • Print_ISBN
    978-1-4577-1580-8
  • Type
    conf
  • DOI
    10.1109/ICCCI.2012.6158835
  • Filename
    6158835
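
The abstract says that similarity scores are calculated from search-engine page counts for the two words and for their conjunctive query, but it does not name the scores. As a minimal sketch, assuming Jaccard- and Dice-style co-occurrence measures (a common choice for this kind of score, not confirmed by this record), such scores could be computed as follows. The function names and the page counts are hypothetical and chosen for illustration only.

```python
def web_jaccard(count_p: int, count_q: int, count_pq: int) -> float:
    """Jaccard-style score: hits for 'P AND Q' over hits for either word.

    count_p, count_q: page counts for each word queried alone.
    count_pq: page count for the conjunctive query of both words.
    """
    union = count_p + count_q - count_pq
    if count_pq <= 0 or union <= 0:
        return 0.0  # no co-occurrence evidence
    return count_pq / union


def web_dice(count_p: int, count_q: int, count_pq: int) -> float:
    """Dice-style score: twice the co-occurrence hits over the summed hits."""
    total = count_p + count_q
    if count_pq <= 0 or total <= 0:
        return 0.0
    return 2 * count_pq / total


# Hypothetical page counts, made up for illustration (not real search results).
scores = [web_jaccard(100, 50, 25), web_dice(100, 50, 25)]
print(scores)
```

In a pipeline like the one the abstract describes, scores of this kind would form part of the feature vector that the SVM or LS-SVM combines with the pattern-cluster features.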