  • DocumentCode
    2275930
  • Title
    An Automatic Online News Topic Keyphrase Extraction System
  • Author
    Wang, Canhui; Zhang, Min; Ru, Liyun; Ma, Shaoping
  • Author_Institution
    CS & T Dept., Tsinghua Univ., Beijing
  • Volume
    1
  • fYear
    2008
  • fDate
    9-12 Dec. 2008
  • Firstpage
    214
  • Lastpage
    219
  • Abstract
    News topics are associated with sets of keywords or keyphrases. Topic keyphrases briefly describe the key content of a topic and help users decide whether to read further. Moreover, the keyphrases of a news topic can be regarded as a cluster of related terms, providing term relationship information that can be integrated into information retrieval models. This paper proposes an automatic online news topic keyphrase extraction system. News stories are organized into topics. Keyword candidates are first extracted from individual news stories and filtered using topic information. A phrase identification process then combines keywords into phrases using position information. Finally, the phrases are ranked and the top-ranked ones are selected as topic keyphrases. Experiments on practical Web datasets show that the proposed system works effectively, achieving a precision of 70.61% and a recall of 67.94%.
  • Keywords
    information filtering; information resources; information retrieval systems; automatic online news topic keyphrase extraction system; information filtering; information retrieval; news story; phrase identification process; Data mining; Indexing; Information filtering; Information filters; Information retrieval; Information science; Intelligent agent; Intelligent systems; Laboratories; Search engines;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Sydney, NSW
  • Print_ISBN
    978-0-7695-3496-1
  • Type
    conf
  • DOI
    10.1109/WIIAT.2008.225
  • Filename
    4740452