Title :
Unsupervised spoken term detection with acoustic segment model
Author :
Wang, Haipeng ; Lee, Tan ; Leung, Cheung-Chi
Author_Institution :
Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
Abstract :
This paper describes a study on query-by-example spoken term detection (STD) using the acoustic segment modeling technique. Acoustic segment models (ASMs) are a set of hidden Markov models (HMMs) that are obtained in an unsupervised manner, without using any transcription information. ASM training follows an iterative procedure consisting of initial segmentation, segment labeling, and HMM parameter estimation. The ASMs are incorporated into a template-matching framework for query-by-example STD. Both the spoken query examples and the test utterances are represented by frame-level ASM posteriorgrams. Segmental dynamic time warping (DTW) is applied to match the query against the test utterance and locate possible occurrences. The performance of the proposed approach is evaluated with different DTW local distance measures on the TIMIT and Fisher corpora. Experimental results show that ASM posteriorgrams give consistently better detection performance than conventional GMM posteriorgrams.
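The template-matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain subsequence DTW rather than the segmental DTW named above, and the local distance (negative log inner product between posterior vectors) is an assumption, since the abstract does not list the distance measures that were compared. The function names `neg_log_inner_product` and `subsequence_dtw` are hypothetical, and a posteriorgram is represented simply as a list of per-frame probability vectors.

```python
import math

def neg_log_inner_product(p, q, eps=1e-10):
    """Assumed local distance between two posterior vectors: -log(p . q)."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))  # eps guards against log(0)

def subsequence_dtw(query, test, dist=neg_log_inner_product):
    """Align the whole query to the best-matching region of the test utterance.

    query, test: posteriorgrams (lists of per-frame probability vectors).
    Returns (cost normalised by query length, start frame, end frame).
    """
    n, m = len(query), len(test)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning query[0..i]
    # to a region of the test utterance that ends at frame j.
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: test frame where that best path began.
    start = [[0] * m for _ in range(n)]

    # The query may start at any test frame at no extra cost.
    for j in range(m):
        cost[0][j] = dist(query[0], test[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Predecessors: vertical (i-1, j), diagonal (i-1, j-1),
            # horizontal (i, j-1).
            best, origin = cost[i - 1][j], start[i - 1][j]
            if j > 0:
                if cost[i - 1][j - 1] <= best:
                    best, origin = cost[i - 1][j - 1], start[i - 1][j - 1]
                if cost[i][j - 1] < best:
                    best, origin = cost[i][j - 1], start[i][j - 1]
            cost[i][j] = dist(query[i], test[j]) + best
            start[i][j] = origin

    # The query may end at any test frame; pick the cheapest ending.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end] / n, start[n - 1][end], end

# Toy example with two "acoustic units": the query is unit 0 followed by unit 1.
query = [[1.0, 0.0], [0.0, 1.0]]
test = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
score, first, last = subsequence_dtw(query, test)
print(first, last)
```

In this toy case the query aligns to test frames 2 to 3 with a cost of essentially zero. A detection system would compare the normalised cost against a threshold to decide whether the term occurs; the choice of threshold and of normalisation is left open here.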
Keywords :
database management systems; hidden Markov models; parameter estimation; query processing; DTW local distance measures; Fisher Corpora; HMM parameter estimation; TIMIT; acoustic segment modeling technique; frame-level ASM posteriorgrams; initial segmentation; iterative procedure; query-by-example spoken term detection; segmental dynamic time warping; segments labeling; template-matching framework; test utterance; Acoustic Segment Model; Posteriorgram; Query-by-Example; Unsupervised Spoken Term Detection
Conference_Title :
Speech Database and Assessments (Oriental COCOSDA), 2011 International Conference on
Conference_Location :
Hsinchu
Print_ISBN :
978-1-4577-0930-2
DOI :
10.1109/ICSDA.2011.6085989