DocumentCode
2664976
Title
Online discovery of relevant terms from Internet
Author
Donghong, Ji ; Lingpeng, Yang ; Yu, Nie ; Li, Tang
Author_Institution
Inst. for Infocomm Res., Singapore, Singapore
fYear
2003
fDate
26-29 Oct. 2003
Firstpage
327
Lastpage
332
Abstract
We propose a fast method for acquiring relevant terms from the Internet. For any search term, the text summaries in the hit list returned by search engines may contain most, if not all, of the significant terms relevant to it; furthermore, these terms are very likely to be prominent in the summaries, which makes it possible to identify them there. To do so, we adopt a seeding-and-expansion strategy, which first locates some seed words and then expands from them to obtain the terms. Compared with other methods, this one uses the Internet as a dynamic corpus which, combined with search engines, forms an ideal resource for relevant term extraction because of its huge volume of content and its continual updating. In addition, the method aims to serve online applications: the seeding-and-expansion strategy reduces the amount of statistical data that must be processed.
Keywords
Internet; data mining; search engines; text analysis; Web mining; dynamic corpus; information extraction; knowledge discovery; relevant term extraction; Buildings; Encyclopedias; Frequency; Indexing; Natural language processing; Portals
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location
Beijing, China
Print_ISBN
0-7803-7902-0
Type
conf
DOI
10.1109/NLPKE.2003.1275924
Filename
1275924