Optimising Figure of Merit for phonetic spoken term detection

Author

Wallace, Roy ; Vogt, Robbie ; Baker, Brendan ; Sridharan, Sridha

Author_Institution

Speech & Audio Res. Lab., Queensland Univ. of Technol. (QUT), Brisbane, QLD, Australia

fYear

2010

fDate

14-19 March 2010

Firstpage

5298

Lastpage

5301

Abstract

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection (STD). The FOM is a popular measure of STD accuracy, making it an ideal candidate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phone classifier, producing enhanced log-posterior features that are more suitable for the STD task. The FOM is then optimised directly by training the parameters of this model with a nonlinear gradient descent algorithm. A relative FOM improvement of 11% is achieved on held-out evaluation data, demonstrating the generalisability of the approach.
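
The linear model described in the abstract can be pictured as a matrix applied to each frame's vector of phone log-posteriors. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `identity`, `enhance_frame` and `enhance` are hypothetical, the weight matrix is assumed to be square with no bias term, and starting from the identity matrix (so that the untrained model leaves the features unchanged) is an illustrative choice. The FOM objective and the gradient descent training are not shown.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def identity(n: int) -> Matrix:
    """Hypothetical starting point: a model that leaves features unchanged."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def enhance_frame(weights: Matrix, log_posteriors: Vector) -> Vector:
    """Apply the linear model to one frame: each output is a weighted sum
    of the input phone log-posteriors (one matrix row per output phone)."""
    return [sum(w * x for w, x in zip(row, log_posteriors)) for row in weights]


def enhance(weights: Matrix, frames: List[Vector]) -> List[Vector]:
    """Apply the same linear model to every frame of an utterance."""
    return [enhance_frame(weights, frame) for frame in frames]


if __name__ == "__main__":
    # Two frames of made-up log-posteriors over three phones (assumed data).
    frames = [[-0.2, -1.9, -3.1], [-2.5, -0.1, -4.0]]
    weights = identity(3)
    print(enhance(weights, frames))
```

In a sketch like this, training would adjust the entries of `weights` so that the enhanced features give a higher FOM on the training data.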

Keywords

probability; speech processing; speech recognition; figure of merit; phone log-posterior probabilities output; phonetic spoken term detection; simple linear model; Australia; Decoding; Error analysis; Indexing; Information retrieval; Laboratories; Magneto electrical resistivity imaging technique; Viterbi algorithm; spoken term detection

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on

Conference_Location

Dallas, TX

ISSN

1520-6149

Print_ISBN

978-1-4244-4295-9

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2010.5494969

Filename

5494969