Title :
Exploiting diversity for spoken term detection
Author :
Mangu, Lidia ; Soltau, Hagen ; Kuo, Hong-Kwang ; Kingsbury, Brian ; Saon, George
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
This paper describes a state-of-the-art spoken term detection (STD) system in which significant improvements are obtained by diversifying the ASR engines used for indexing and combining their search results. First, we describe the design factors that, when varied, produce complementary STD systems, and show that the performance of the combined system is three times better than that of the best individual component. Next, we describe different strategies for system combination and show that significant improvements can be achieved by normalizing the combined scores. We propose a classifier-based system combination strategy that outperforms a highly optimized baseline. The system described in this paper had the highest accuracy in the 2012 DARPA RATS evaluation.
Keywords :
pattern classification; speech recognition; 2012 DARPA RATS evaluation; ASR engine; classifier-based system combination strategy; complementary STD system; indexing; spoken term detection system; Acoustics; Hidden Markov models; Indexes; Lattices; Rats; Speech; Training; audio indexing; keyword spotting; spoken term detection; system combination
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639280