DocumentCode :
1798417
Title :
Chunks of thought: Finding salient semantic structures in texts
Author :
Mei Mei ; Vanarase, Aashay ; Minai, Ali A.
Author_Institution :
Dept. of Electr. Eng. & Comput. Syst., Univ. of Cincinnati, Cincinnati, OH, USA
fYear :
2014
fDate :
6-11 July 2014
Firstpage :
3958
Lastpage :
3965
Abstract :
As the availability of large, digital text corpora increases, so does the need for automatic methods to analyze them and to extract significant information from them. A number of algorithms have been developed for these applications, with topic modeling-based algorithms such as latent Dirichlet allocation (LDA) enjoying much recent popularity. In this paper, we focus on a specific but important problem in text analysis: identifying coherent lexical combinations that represent "chunks of thought" within the larger discourse. We term these salient semantic chunks (SSCs), and present two complementary approaches for their extraction. Both approaches derive from a cognitive rather than purely statistical perspective on the generation of texts. We apply the two algorithms to a corpus of abstracts from IJCNN 2009, and show that both algorithms find meaningful chunks that elucidate the semantic structure of the corpus in complementary ways.
Keywords :
information retrieval; text analysis; IJCNN 2009; LDA; SSC; abstracts corpus; cognitive perspective; coherent lexical combinations; digital text corpora; latent Dirichlet allocation; salient semantic chunks; salient semantic structures; topic modeling-based algorithms; Abstracts; Algorithm design and analysis; Brain models; Computational modeling; Frequency measurement; Semantics;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
Type :
conf
DOI :
10.1109/IJCNN.2014.6889944
Filename :
6889944