Exploiting diversity for spoken term detection

Author

Mangu, Lidia ; Soltau, Hagen ; Kuo, Hong-Kwang ; Kingsbury, Brian ; Saon, George

Author_Institution

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA

fYear

2013

Firstpage

8282

Lastpage

8286

Abstract

The paper describes a state-of-the-art spoken term detection system in which significant improvements are obtained by diversifying the ASR engines used for indexing and combining the search results. First, we describe the design factors that, when varied, produce complementary STD systems and show that the performance of the combined system is 3 times better than the best individual component. Next, we describe different strategies for system combination and show that significant improvements can be achieved by normalizing the combined scores. We propose a classifier-based system combination strategy which outperforms a highly optimized baseline. The system described in this paper had the highest accuracy in the 2012 DARPA RATS evaluation.

Keywords

pattern classification; speech recognition; 2012 DARPA RATS evaluation; ASR engine; classifier-based system combination strategy; complementary STD system; indexing; spoken term detection system; Acoustics; Hidden Markov models; Indexes; Lattices; Rats; Speech; Training; audio indexing; keyword spotting; spoken term detection; system combination

fLanguage

English

Publisher

IEEE

Conference_Title

2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location

Vancouver, BC

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2013.6639280

Filename

6639280