• DocumentCode
    2664976
  • Title
    Online discovery of relevant terms from Internet
  • Author
    Donghong, Ji ; Lingpeng, Yang ; Yu, Nie ; Li, Tang
  • Author_Institution
    Inst. for Infocomm Res., Singapore, Singapore
  • fYear
    2003
  • fDate
    26-29 Oct. 2003
  • Firstpage
    327
  • Lastpage
    332
  • Abstract
    We propose a fast method for acquiring relevant terms from the Internet. For any search term, the text summaries in the hit list returned by search engines may contain most, if not all, of the significant terms relevant to it. Furthermore, these terms are very likely to be prominent in the text summaries, which makes it possible to identify them from the summaries. To do so, we adopt a seeding-and-expansion strategy, which first locates some seed words and then expands from them to obtain the terms. Compared with other methods, this one uses the Internet as a dynamic corpus which, combined with search engines, forms an ideal resource for relevant term extraction because of its huge and continually updated content. In addition, the method seeks to serve online applications by reducing the volume of statistical data through the seeding-and-expansion strategy.
  • Keywords
    Internet; data mining; search engines; text analysis; Web mining; dynamic corpus; information extraction; knowledge discovery; relevant term extraction; Buildings; Encyclopedias; Frequency; Indexing; Natural language processing; Portals
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
  • Conference_Location
    Beijing, China
  • Print_ISBN
    0-7803-7902-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2003.1275924
  • Filename
    1275924