• DocumentCode
    162593
  • Title
    User Profile Discovery for Web Search
  • Author
    Gopalakrishnan, T. ; Segottuvelan, P. ; Sathyamoorthy, J.
  • Author_Institution
    Dept. of Inf. Technol., Bannari Amman Inst. of Technol., Sathyamangalam, India
  • fYear
    2014
  • fDate
    6-7 March 2014
  • Firstpage
    377
  • Lastpage
    381
  • Abstract
    The web has not yet achieved its goal of providing easy access to online information. As the web grows, the abundance of available information creates the challenging problem of information overload for web users. The system implements an empirical method to estimate the semantic similarity between two words using page counts and text snippets retrieved from a web search engine. Specifically, we define various word co-occurrence measures using page counts and integrate those with lexical patterns extracted from text snippets. To identify the numerous semantic relations that exist between two given words, we propose a novel pattern extraction algorithm and a pattern clustering algorithm. The optimal combination of page count-based co-occurrence measures and lexical pattern clusters is learned using support vector machines. The proposed method outperforms various baselines and previously proposed web-based semantic similarity measures on three benchmark data sets, showing a high correlation with human ratings. Moreover, the proposed method significantly improves accuracy in a community mining task.
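    The abstract does not name the specific co-occurrence measures it uses. As a minimal sketch, the following assumes four measures commonly computed from search engine page counts (Jaccard, Dice, overlap, and pointwise mutual information). The function names, the page counts, and the index size `n` are hypothetical illustrations, not values from the paper.

```python
import math


def web_jaccard(p: int, q: int, pq: int) -> float:
    """Jaccard coefficient from page counts.

    p, q: page counts for each word alone; pq: page count for both words.
    """
    if pq == 0:
        return 0.0
    return pq / (p + q - pq)


def web_dice(p: int, q: int, pq: int) -> float:
    """Dice coefficient from page counts."""
    if pq == 0:
        return 0.0
    return 2 * pq / (p + q)


def web_overlap(p: int, q: int, pq: int) -> float:
    """Overlap (Simpson) coefficient from page counts."""
    if pq == 0:
        return 0.0
    return pq / min(p, q)


def web_pmi(p: int, q: int, pq: int, n: int) -> float:
    """Pointwise mutual information (base 2).

    n is an assumed estimate of the number of pages indexed by the engine.
    """
    if pq == 0:
        return 0.0
    return math.log2((pq / n) / ((p / n) * (q / n)))


if __name__ == "__main__":
    # Hypothetical page counts for two words and their conjunction.
    p, q, pq, n = 100, 50, 25, 1000
    features = [
        web_jaccard(p, q, pq),
        web_dice(p, q, pq),
        web_overlap(p, q, pq),
        web_pmi(p, q, pq, n),
    ]
    print(features)
```

    In a pipeline like the one the abstract outlines, a feature vector of this kind would be combined with lexical pattern cluster features before being passed to a support vector machine; that combination step is not shown here.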
  • Keywords
    pattern clustering; search engines; support vector machines; text analysis; Web search engine; Web users; Web-based semantic similarity measures; community mining task; human ratings; lexical pattern clusters; lexical patterns; online information access; page counts-based co-occurrence measures; pattern clustering algorithm; pattern extraction algorithm; support vector machines; text fragments; text snippets; user profile discovery; word co-occurrence measures; Accuracy; Biological system modeling; Clustering algorithms; Communities; Information technology; Measurement; Web search; Clustering; Profile-Library; classification; sequence of commands; user behavior;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Intelligent Computing Applications (ICICA), 2014 International Conference on
  • Conference_Location
    Coimbatore
  • Type
    conf
  • DOI
    10.1109/ICICA.2014.83
  • Filename
    6965075