Regional Information Center for Science and Technology (RICeST) - Highly accurate phonetic segmentation using boundary correction models and system fusion

DocumentCode :

179579

Title :

Highly accurate phonetic segmentation using boundary correction models and system fusion

Author :

Stolcke, Andreas ; Ryant, Neville ; Mitra, Ved ; Jiahong Yuan ; Wen Wang ; Liberman, Mark

Author_Institution :

Microsoft Res., Mountain View, CA, USA

fYear :

2014

fDate :

4-9 May 2014

Firstpage :

5552

Lastpage :

5556

Abstract :

Accurate phone-level segmentation of speech remains an important task for many subfields of speech research. We investigate techniques for boosting the accuracy of automatic phonetic segmentation based on HMM acoustic-phonetic models. In prior work [25] we were able to improve on state-of-the-art alignment accuracy by employing special phone boundary HMMs, trained on phonetically segmented training data, in conjunction with a simple boundary-time correction model. Here we present further improved results by using more powerful statistical models for boundary correction that are conditioned on phonetic context and duration features. Furthermore, we find that combining multiple acoustic front-ends gives additional gains in accuracy, and that conditioning the combiner on phonetic context and side information helps. Overall, we reduce segmentation errors on the TIMIT corpus by almost one half, raising boundary accuracy from 93.9% to 96.8% with a 20-ms tolerance.
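
The figures in the abstract can be checked with a short calculation. The Python sketch below is illustrative only and is not taken from the paper: it assumes boundary accuracy means the fraction of predicted boundaries that fall within the tolerance of the corresponding reference boundary, with both lists index-aligned and given in integer milliseconds. The function names and the sample boundary times are hypothetical.

```python
def boundary_accuracy(reference_ms, predicted_ms, tolerance_ms=20):
    """Fraction of predicted boundaries within tolerance_ms of the reference.

    Assumption: both lists describe the same phone sequence, so the i-th
    predicted boundary corresponds to the i-th reference boundary.
    """
    if len(reference_ms) != len(predicted_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("boundary lists must not be empty")
    hits = sum(
        1
        for ref, pred in zip(reference_ms, predicted_ms)
        if abs(ref - pred) <= tolerance_ms
    )
    return hits / len(reference_ms)


def relative_error_reduction(accuracy_before, accuracy_after):
    """Relative drop in error rate, where error rate = 1 - accuracy."""
    error_before = 1.0 - accuracy_before
    error_after = 1.0 - accuracy_after
    return (error_before - error_after) / error_before


# Hypothetical boundary times in milliseconds, for illustration only.
reference = [100, 250, 400, 520]
predicted = [105, 275, 395, 521]

# Three of the four predictions are within 20 ms of the reference.
accuracy = boundary_accuracy(reference, predicted)

# Accuracies reported in the abstract: 93.9% before, 96.8% after.
reduction = relative_error_reduction(0.939, 0.968)
```

With the reported accuracies, the error rate falls from 6.1% to 3.2%, a relative reduction of about 47.5%, which is consistent with the abstract's "almost one half".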

Keywords :

hidden Markov models; speech processing; statistical analysis; HMM acoustic-phonetic models; TIMIT; acoustic front-ends; automatic phonetic segmentation; boundary correction models; boundary-time correction model; phone boundary HMM models; phone-level segmentation; phonetic context; phonetic segmentation; phonetically segmented training data; segmentation errors; speech research; statistical models; system fusion; Acoustics; Conferences; Decision support systems; Speech; Speech processing; HMM; forced alignment; phone boundary model; phonetic segmentation; regression; system fusion;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6854665

Filename :

6854665

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=179579