Regional Information Center for Science and Technology (RICEST) - Leveraging speech production knowledge for improved speech recognition

DocumentCode :

2973123

Title :

Leveraging speech production knowledge for improved speech recognition

Author :

Sangwan, Abhijeet ; Hansen, John H L

Author_Institution :

Center for Robust Speech Syst. (CRSS), Univ. of Texas at Dallas (UTD), Richardson, TX, USA

fYear :

2009

fDate :

Nov. 13 2009-Dec. 17 2009

Firstpage :

Lastpage :

Abstract :

This study presents a novel methodology for speech recognition based on phonological features (PFs), which leverages the relationship between speech phonology and phonetics. In particular, the proposed scheme estimates the likelihood of observing speech phonology given an associative lexicon. In this manner, the scheme can choose the most likely hypothesis (word candidate) from a group of competing alternatives. The framework employs the maximum entropy (ME) model to learn the relationship between phonetics and phonology. Subsequently, we extend the ME model to an ME-HMM (maximum entropy-hidden Markov model), which captures the speech production and linguistic relationship between phonology and words. The proposed ME-HMM is applied to the task of re-processing N-best lists, where absolute WRA (word recognition rate) increases of 1.7%, 1.9%, and 1% are reported for the TIMIT, NTIMIT, and SPINE (speech in noise) corpora, respectively (15.5% and 22.5% relative reductions in word error rate for TIMIT and NTIMIT).
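
The idea of re-processing an N-best list can be illustrated with a minimal sketch. This is not the authors' ME-HMM: it is a generic log-linear (maximum-entropy-style) rescorer, and the names `rescore_nbest`, `asr_logprob`, and `pf_logprob`, the weights, and the toy scores are all hypothetical values chosen for illustration.

```python
import math


def maxent_probs(scores):
    """Normalize log-linear scores into probabilities (softmax)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rescore_nbest(nbest, weights):
    """Re-rank word candidates by a weighted sum of their feature scores.

    nbest   -- list of (word, {feature_name: value}) pairs
    weights -- {feature_name: weight}; features without a weight are ignored
    Returns a list of (word, probability), most probable first.
    """
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for _, feats in nbest
    ]
    probs = maxent_probs(scores)
    words = [word for word, _ in nbest]
    return sorted(zip(words, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-best list: a baseline recognizer score plus a score from a
# phonological-feature model (both log-probabilities, made up for this sketch).
nbest = [
    ("beat", {"asr_logprob": -4.1, "pf_logprob": -2.0}),
    ("bead", {"asr_logprob": -4.0, "pf_logprob": -3.5}),
    ("bit", {"asr_logprob": -4.6, "pf_logprob": -2.9}),
]
weights = {"asr_logprob": 1.0, "pf_logprob": 1.0}  # assumed equal weighting

ranked = rescore_nbest(nbest, weights)
best_word = ranked[0][0]
print(best_word)
```

In this toy list the baseline recognizer alone would prefer "bead", while the combined score prefers "beat", which is the kind of re-ranking that the phonological evidence is meant to enable.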

Keywords :

hidden Markov models; maximum entropy methods; speech recognition; maximum entropy-hidden Markov model; phonological features; speech in noise corpora; speech phonetics; speech phonology; speech production knowledge; speech recognition; word recognition rate; Automatic speech recognition; Entropy; Error analysis; Government; Hidden Markov models; Noise reduction; Resonance; Robustness; Speech enhancement; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location :

Merano

Print_ISBN :

978-1-4244-5478-5

Electronic_ISBN :

978-1-4244-5479-2

Type :

conf

DOI :

10.1109/ASRU.2009.5373368

Filename :

5373368

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2973123