DocumentCode :
1532194
Title :
Uncertainty Reduction for Knowledge Discovery and Information Extraction on the World Wide Web
Author :
Ji, Heng ; Deng, Hongbo ; Han, Jiawei
Author_Institution :
Department of Computer Science, City University of New York, New York City, NY, USA
Volume :
100
Issue :
9
fYear :
2012
Firstpage :
2658
Lastpage :
2674
Abstract :
In this paper, we give an overview of knowledge discovery (KD) and information extraction (IE) techniques on the World Wide Web (WWW). We intend to answer the following questions: What kind of additional uncertainty challenges are introduced by the WWW setting to basic KD and IE techniques? What are the fundamental techniques that can be used to reduce such uncertainty and achieve reasonable KD and IE performance on the WWW? What is the impact of each novel method? What types of interactions can be conducted between these techniques and information networks to make them benefit from each other? In what way can we utilize the results in more interesting applications? What are the remaining challenges and what are the possible ways to address these challenges? We hope this can provide a road map to advance KD and IE on the WWW to a higher level of performance, portability and utilization.
Keywords :
Analytical models; Hidden Markov models; Natural language processing; Text mining; Text processing; Uncertainty; World Wide Web; Text analysis
fLanguage :
English
Journal_Title :
Proceedings of the IEEE
Publisher :
IEEE
ISSN :
0018-9219
Type :
jour
DOI :
10.1109/JPROC.2012.2190489
Filename :
6212297