DocumentCode
2010922
Title
Integrating acoustic and lexical features in topic segmentation of Chinese broadcast news using maximum entropy approach
Author
Xie, Lei ; Yang, Yulian ; Liu, Zhi-Qiang ; Feng, Wei ; Liu, Zihan
Author_Institution
Sch. of Comput. Sci., Northwestern Polytech. Univ., Xi'an, China
fYear
2010
fDate
23-25 Nov. 2010
Firstpage
407
Lastpage
413
Abstract
This paper studies how to integrate multi-modal features in automatic topic segmentation of Mandarin broadcast news. The multi-modal feature integration problem is formulated within the Maximum Entropy (MaxEnt) scheme for topic boundary classification, which maximizes the entropy while respecting all known constraints (i.e., the contributions of the multiple features). We consider two types of features in particular: (1) acoustic features, which reflect the editorial prosody of broadcast news, including pause duration, speaker change and speech type; and (2) lexical features extracted from speech recognition transcripts, which capture the semantic shifts between topics, including two local cohesiveness features and a new boundary indicator based on overall cohesiveness. Compared to the local lexical features, the new overall cohesiveness feature maximizes the lexical cohesiveness of all topic fragments and reflects the fact that topic transitions in broadcast news are smooth and the distributional variations are subtle. Experiments show a clear performance improvement in topic segmentation of Chinese broadcast news when acoustic and lexical features are fused within the MaxEnt scheme.
Keywords
feature extraction; image segmentation; maximum entropy methods; speech recognition; Chinese broadcast news; Mandarin broadcast news; MaxEnt scheme; acoustic feature; boundary indicator; editorial prosody; lexical cohesiveness feature; lexical feature extraction; maximum entropy approach; maximum entropy scheme; multimodal feature integration; pause duration; speaker change; speech recognition; speech type; topic boundary classification; topic segmentation; Editorials; Entropy; Feature extraction; Music; Speech; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Audio Language and Image Processing (ICALIP), 2010 International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-4244-5856-1
Type
conf
DOI
10.1109/ICALIP.2010.5684551
Filename
5684551