DocumentCode :
2664976
Title :
Online discovery of relevant terms from Internet
Author :
Donghong, Ji ; Lingpeng, Yang ; Yu, Nie ; Li, Tang
Author_Institution :
Inst. for Infocomm Res., Singapore, Singapore
fYear :
2003
fDate :
26-29 Oct. 2003
Firstpage :
327
Lastpage :
332
Abstract :
We propose a fast method for acquiring relevant terms from the Internet. For any search term, the text summaries in the hit list returned by search engines may contain most, if not all, of the significant terms relevant to it; furthermore, these terms are very likely to be prominent in the summaries. This makes it possible to identify them from the summaries. To do so, we adopt a seeding-and-expansion strategy, which first locates some seed words and then expands from them to obtain the terms. Compared with other methods, this one uses the Internet as a kind of dynamic corpus, which, combined with search engines, forms an ideal resource for relevant term extraction because of its huge volume of content and continual updating. In addition, the method seeks to serve online applications by using the seeding-and-expansion strategy to reduce the large volume of statistical data.
Keywords :
Internet; data mining; search engines; text analysis; Web mining; dynamic corpus; information extraction; knowledge discovery; relevant term extraction; Buildings; Encyclopedias; Frequency; Indexing; Natural language processing; Portals
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Beijing, China
Print_ISBN :
0-7803-7902-0
Type :
conf
DOI :
10.1109/NLPKE.2003.1275924
Filename :
1275924