DocumentCode :
686514
Title :
Auto-clustering of conversation corpus based on syntactic, semantic and pragmatic features
Author :
Baojian Chen ; Minghu Jiang
Author_Institution :
Sch. of Humanities, Tsinghua Univ., Beijing, China
fYear :
2013
fDate :
22-25 Nov. 2013
Firstpage :
295
Lastpage :
300
Abstract :
To understand natural language accurately, morphological and syntactic analysis must be combined with semantic knowledge and pragmatic information in a specific context. Because conversation corpora offer limited knowledge and lack pragmatics-related background information, computers are still far from fully understanding natural language. In this paper, pragmatic features are added to the text vector space model of spoken-language conversation, and hierarchical clustering is performed. The experimental results show that clustering with pragmatic features outperforms clustering without them: precision, recall and F-value increase by 6.67%, 6.34% and 6.6%, respectively. This indicates that pragmatic information plays an important role in improving text clustering.
Keywords :
natural language processing; pattern clustering; programming language semantics; text analysis; conversation corpus auto-clustering; hierarchical clustering; language spoken conversation; natural language morphology; pragmatic feature; pragmatic information; semantic feature; semantic knowledge; syntactic analysis; syntactic feature; text clustering effect enhancement; text vector space model
fLanguage :
English
Publisher :
IET
Conference_Titel :
Wireless, Mobile and Multimedia Networks (ICWMMN 2013), 5th IET International Conference on
Conference_Location :
Beijing
Electronic_ISBN :
978-1-84919-726-7
Type :
conf
DOI :
10.1049/cp.2013.2428
Filename :
6827845