DocumentCode
117906
Title
Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news
Author
Guangpu Huang ; Chenglin Xu ; Xiong Xiao ; Lei Xie ; Eng Siong Chng ; Haizhou Li
Author_Institution
Temasek Labs. @NTU, Singapore, Singapore
fYear
2014
fDate
9-12 Dec. 2014
Firstpage
1
Lastpage
9
Abstract
This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We propose a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and use them together in the DNN-CRF model to predict sentence boundaries. We test the accuracy of the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that, on the reference transcription, the best system significantly outperforms the state-of-the-art sentence unit detection system, with a 13.2% absolute reduction in NIST sentence error rate. However, the performance gain is limited on the recognized transcription, partly due to the high word error rate.
Keywords
feature extraction; natural language processing; neural nets; speech recognition; DNN-CRF model; NIST RT-04 English broadcast news speech data; NIST sentence error rate reduction; acoustic domain; articulatory domain; deep neural network conditional random field system; linguistic domain; multiview feature extraction; sentence unit detection improvement; word error rate; Acoustics; Feature extraction; Hidden Markov models; Pragmatics; Speech; Tongue; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
Conference_Location
Siem Reap
Type
conf
DOI
10.1109/APSIPA.2014.7041543
Filename
7041543