Title :
Rescoring Confusion Networks for Keyword Search
Author :
Soto, Victor ; Cooper, Erica ; Mangu, Lidia ; Rosenberg, Andrew ; Hirschberg, Julia
Abstract :
We introduce a two-stage cascaded scheme to rescore Confusion Networks (CNs) for Keyword Search in the context of Low-Resource Languages. In the first stage, we rescore the CN to reduce the error rate of the 1-best hypothesis using a large number of lexical, phonetic, false-alarm, and structural features. Using a rank-learning Support Vector Machine classifier, we obtain WER gains between 0.54% and 2.84% on Cantonese, Tagalog, Turkish, Pashto, and Vietnamese. In the second stage, we generate keyword hits from the rescored CN and use logistic regression to distinguish true hits from false alarms. We compare these to hits generated from the unrescored CN and obtain gains between 0.45% and 0.9% on the MTWV metric on Tagalog, Turkish, and Pashto by using the aforementioned features together with acoustic and prosodic features.
Keywords :
error correction; error detection; natural language processing; regression analysis; speech recognition; support vector machines; 1-best hypothesis; Cantonese; MTWV metric; Pashto; Tagalog; Turkish; Vietnamese; WER gains; confusion network rescoring; error rate; false alarms; keyword search; logistic regression; low-resource languages; structural features; support vector machine classifier; two-stage cascaded scheme; Acoustics; Feature extraction; Lattices; Speech; Standards; confusion networks; posting lists; rescoring
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854975