DocumentCode :
3165514
Title :
Improving Knowledge Discovery in Document Collections through Combining Text Retrieval and Link Analysis Techniques
Author :
Jin, Wei ; Srihari, Rohini K. ; Ho, Hung Hay ; Wu, Xin
Author_Institution :
State Univ. of New York, Buffalo
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
193
Lastpage :
202
Abstract :
In this paper, we present Concept Chain Queries (CCQ), a special case of text mining in document collections focusing on detecting links between two topics across text documents. We interpret such a query as finding the most meaningful evidence trails across documents that connect these two topics. We propose to use link-analysis techniques over the extracted features provided by an Information Extraction Engine for finding new knowledge. A graphical text representation and mining model is proposed which combines information retrieval, association mining, and link analysis techniques. We present experiments on different datasets that demonstrate the effectiveness of our algorithm. Specifically, the algorithm generates ranked concept chains and evidence trails in which the key terms representing significant relationships between topics are ranked high.
Keywords :
data mining; document handling; information retrieval; natural language processing; concept chain queries; document collections; graphical text representation; information extraction engine; knowledge discovery; link analysis techniques; text documents; text mining; text retrieval; Computer science; Data engineering; Data mining; Engines; Feature extraction; Information analysis; Information retrieval; Knowledge engineering; Text mining; USA Councils;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.62
Filename :
4470243