DocumentCode
2330663
Title
Exploiting semantic associative information in topic modeling
Author
Wu, Meng-Sung ; Lee, Hung-Shin ; Wang, Hsin-Min
Author_Institution
Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
fYear
2010
fDate
12-15 Dec. 2010
Firstpage
384
Lastpage
388
Abstract
Topic modeling has been widely applied in a variety of text modeling tasks, as well as in speech recognition systems, to capture the semantic and statistical information in documents or speech utterances. Most topic models rely on the bag-of-words assumption, which results in learned latent topics composed of lists of individual words. Unfortunately, these words may convey topical information but lack accurate semantic knowledge of the text. In this paper, we present the semantic associative topic model (SATM), which extends the concept of semantic association terms to topic modeling. The model provides guidance on modeling the semantic associations that occur among single words by expressing a document as an association of multiple words. Further, the pointwise KL-divergence metric is used to measure the significance of each association. We also integrate the original probabilistic latent semantic analysis (PLSA) and SATM models, which have mixed feature representations. Experimental results on the WSJ and AP datasets show that the proposed approaches outperform the other methods evaluated.
Keywords
natural language processing; speech recognition; statistical analysis; KL-divergence metric; bag-of-words assumption; exploiting semantic associative information; semantic information; semantic knowledge; speech recognition systems; speech utterances; statistic information; text modeling; topic modeling; information retrieval; language model; semantic association; topic model
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-7904-7
Electronic_ISBN
978-1-4244-7902-3
Type
conf
DOI
10.1109/SLT.2010.5700883
Filename
5700883