Title :
Using rhythmic features for Japanese spoken term detection
Author :
Kanda, Natsuki ; Takeda, Ryu ; Obuchi, Yasunari
Author_Institution :
Central Res. Lab., Hitachi Ltd., Koganei, Japan
Abstract :
A new rescoring method for spoken term detection (STD) is proposed. Phoneme-based close-matching techniques have been used because they can detect out-of-vocabulary (OOV) queries. To improve their accuracy, rescoring techniques have been used to re-rank the results of phoneme-based close-matching; however, conventional rescoring techniques based on an utterance verification model still produce many false detections. To further improve accuracy, this study proposes several features representing the “naturalness” (or “abnormality”) of the durations of phonemes and syllables in detected keyword candidates. These features are incorporated into a conventional rescoring technique using logistic regression. Experimental results on a 604-hour Japanese speech corpus indicate that adding the rhythmic features achieves a further relative error reduction of 8.9% compared with a conventional rescoring technique.
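The idea of combining a conventional verification score with a duration-based feature through logistic regression can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the per-phoneme duration statistics, the mean absolute z-score used as an "abnormality" feature, the toy training data, and the function names `duration_abnormality`, `train_logistic`, and `rescore`. The abstract does not state how the rhythmic features are actually defined.

```python
import math

# Hypothetical per-phoneme duration statistics (mean, std) in seconds.
# Real values would be estimated from a speech corpus.
DURATION_STATS = {"a": (0.08, 0.02), "k": (0.06, 0.015), "i": (0.07, 0.02)}

def duration_abnormality(phones):
    """Mean absolute z-score of phoneme durations (assumed feature).

    `phones` is a list of (phoneme, duration_in_seconds) pairs.
    Larger values mean the durations deviate more from typical ones.
    """
    zs = [abs(d - DURATION_STATS[p][0]) / DURATION_STATS[p][1] for p, d in phones]
    return sum(zs) / len(zs)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    Returns the weights with the bias stored as the last element.
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            err = p - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def rescore(w, features):
    """Probability that a detected candidate is a true hit."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, features)) + w[-1])

# Toy training set: [verification_score, duration_abnormality] -> hit (1) or false alarm (0).
X = [[0.9, 0.5], [0.8, 0.7], [0.7, 0.4], [0.8, 3.0], [0.6, 2.5], [0.4, 1.0]]
y = [1, 1, 1, 0, 0, 0]
w = train_logistic(X, y)

# Two candidates with the same verification score but different rhythm.
natural = rescore(w, [0.8, duration_abnormality([("k", 0.06), ("a", 0.09)])])
abnormal = rescore(w, [0.8, duration_abnormality([("k", 0.15), ("a", 0.02)])])
```

In this toy setup the learned weight on the abnormality feature is negative, so of two candidates with equal verification scores the one with more typical phoneme durations is ranked higher.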
Keywords :
natural language processing; query processing; regression analysis; speech processing; speech recognition; Japanese STD; Japanese speech corpus; Japanese spoken term detection; OOV query detection; false detection; logistic regression; out-of-vocabulary query detection; phoneme duration abnormality; phoneme duration naturalness; phoneme-based close-matching technique accuracy improvement; relative error reduction; rescoring method; rhythmic features; syllable duration abnormality; syllable duration naturalness; utterance verification model; Accuracy; Acoustics; Feature extraction; Hidden Markov models; Logistics; Probability; Speech; spoken document retrieval; spoken term detection; utterance verification
Conference_Title :
Spoken Language Technology Workshop (SLT), 2012 IEEE
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4673-5125-6
Electronic_ISBN :
978-1-4673-5124-9
DOI :
10.1109/SLT.2012.6424217