DocumentCode :
117906
Title :
Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news
Author :
Guangpu Huang ; Chenglin Xu ; Xiong Xiao ; Lei Xie ; Eng Siong Chng ; Haizhou Li
Author_Institution :
Temasek Labs@NTU, Singapore, Singapore
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
1
Lastpage :
9
Abstract :
This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We propose a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains and use them jointly in the DNN-CRF model to predict sentence boundaries. We evaluate the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that, on the reference transcription, the best system outperforms the state-of-the-art sentence unit detection system by a significant 13.2% absolute reduction in NIST sentence error rate. On the recognized transcription, however, the performance gain is limited, partly because of the high word error rate.
Keywords :
feature extraction; natural language processing; neural nets; speech recognition; DNN-CRF model; NIST RT-04 English broadcast news speech data; NIST sentence error rate reduction; acoustic domain; articulatory domain; deep neural network conditional random field system; linguistic domain; multiview feature extraction; sentence unit detection improvement; word error rate; Acoustics; Feature extraction; Hidden Markov models; Pragmatics; Speech; Tongue; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
Conference_Location :
Siem Reap
Type :
conf
DOI :
10.1109/APSIPA.2014.7041543
Filename :
7041543