DocumentCode
3039935
Title
The research of theme identification in scientific documents
Author
Chunlei, Ye ; Lu, Feng
Author_Institution
Nat. Sci. of Libr., Beijing, China
Volume
3
fYear
2012
fDate
25-27 May 2012
Firstpage
715
Lastpage
718
Abstract
Technical documents contain abundant thematic information that reveals their subject content. Co-word analysis is an important method in scientometrics, and theme clustering based on co-word analysis has become one of its most active research areas. Co-word clustering groups scientific and technical documents into a series of clusters. These theme clusters reflect how research trends evolve, which helps researchers follow the development of science, so it is necessary to identify the theme of each cluster. This paper analyses several typical approaches to theme identification in co-word analysis and their drawbacks, and proposes an improved method that combines co-word analysis with the Latent Dirichlet Allocation (LDA) model. The experimental results show that the proposed approach retains the merits of improved co-word analysis, in particular by strengthening the thematic character and coherence of the descriptors, and is therefore better suited to theme identification in scientific documents.
Keywords
document handling; pattern clustering; scientific information systems; coword clustering analysis; latent Dirichlet allocation model; scientific documents; scientometrics analysis; technical documentations; thematic information; theme clustering analysis; theme identification; Algorithm design and analysis; Cities and towns; Educational institutions; Indexes; Software; Software engineering; Latent Dirichlet Allocation; cluster analysis; co-word; theme identification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on
Conference_Location
Zhangjiajie
Print_ISBN
978-1-4673-0088-9
Type
conf
DOI
10.1109/CSAE.2012.6273049
Filename
6273049