• DocumentCode
    2548733
  • Title

    Improving web search ranking by incorporating summarization

  • Author

    Meng, Xian-jun ; Chen, Qing-cai ; Wang, Xiao-long ; Yang, Xiao-hong

  • Author_Institution
    Harbin Inst. of Technol., Shenzhen
  • fYear
    2007
  • fDate
    7-10 Oct. 2007
  • Firstpage
    3075
  • Lastpage
    3080
  • Abstract
    Although link-analysis-based page ranking approaches have achieved great success in commercial search engines (SE), content-based relevance computing approaches also play a very important role in ranking information retrieval results. Since most existing relevance computing algorithms run on the full text of a web page, this paper focuses on computing the relevance between the user's query and an auto-generated text summary of each web page. The first part of the paper gives a brief introduction to the state of the art of relevance computing in SE, with particular attention to the inference network approach, since it is the baseline method in our experimental SE system. The auto text summarization method based on multi-source integration is then introduced, and the full text of each web page is replaced by its auto-generated abstract to compute the relevance between the web page and the user query. To evaluate the effect of this condensed representation of the full text on the relevance-based page ranking of a system, the last part of the paper reports several experiments, covering the above method with different compression ratios as well as full-text-based ranking. In addition to the efficiency gain for the SE system, the experimental results show that ranking based on the summaries generated by our text summarization system with a 30% compression ratio also yields an 11.29% precision improvement for the SE system.
  • Keywords
    Internet; content-based retrieval; search engines; text analysis; Web page; Web search ranking; autogenerated text summarization; commercial search engines; content based relevance computing approaches; full text based ranking; inference network; information retrieval; link analysis; page ranking approaches; text summarization system; user query; Computer science; Content based retrieval; Electronic mail; Information retrieval; Search engines; Text analysis; Web pages; Web search; Working environment noise; World Wide Web; Auto Text Summarization; Multi-sources Integration; Search Engine; Web Page Ranking;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
  • Conference_Location
    Montreal, Que.
  • Print_ISBN
    978-1-4244-0990-7
  • Electronic_ISBN
    978-1-4244-0991-4
  • Type

    conf

  • DOI
    10.1109/ICSMC.2007.4414122
  • Filename
    4414122