DocumentCode :
3752214
Title :
Rescoring by a deep neural network for spoken term detection
Author :
Ryota Konno;Kazunori Kojima;Kazuyo Tanaka;Shi-wook Lee;Yoshiaki Itoh
Author_Institution :
Iwate Prefectural University, Japan
fYear :
2015
Firstpage :
1207
Lastpage :
1211
Abstract :
In spoken-term detection (STD), the detection of out-of-vocabulary (OOV) query terms is crucial because query terms are likely to be OOV terms. This paper proposes a rescoring method that uses the posterior probabilities output by a deep neural network (DNN) to improve detection accuracy for OOV query terms. Conventional STD methods for OOV query terms search the subword sequences of the speech data, obtained with an automatic speech recognizer, for the query subword sequence. The proposed method then performs detailed matching using the probabilities output by the DNN. A pseudo query at the frame or state level is generated so that it can be aligned with the probabilities obtained at the frame level. To reduce the computational burden of the DNN, the proposed method is applied only to the top candidate utterances, which can be found quickly by a conventional STD method. Experiments were conducted to evaluate the performance of the proposed method, using the open test collections for the SpokenDoc tasks of the NTCIR-9 and NTCIR-10 workshops as benchmarks. The proposed method improved the mean average precision by between 5 and 20 points, surpassing the best accuracy obtained at the workshops. These results demonstrate the effectiveness of the proposed method.
Keywords :
"Hidden Markov models","Speech recognition","Speech","Conferences","Acoustics","Probability","Neural networks"
Publisher :
IEEE
Conference_Title :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2015 Asia-Pacific
Type :
conf
DOI :
10.1109/APSIPA.2015.7415465
Filename :
7415465