Regional Information Center for Science and Technology - Speech Recognition With Flat Direct Models

DocumentCode :

1335703

Title :

Speech Recognition With Flat Direct Models

Author :

Nguyen, Patrick ; Heigold, Georg ; Zweig, Geoffrey

Author_Institution :

Microsoft Research, Redmond, WA, USA

Volume :

Issue :

Year :

2010

Firstpage :

994

Lastpage :

1006

Abstract :

This paper describes a novel direct modeling approach for speech recognition. We propose a log-linear modeling framework built on numerous features, each of which measures some form of consistency between the underlying speech and an entire sequence of hypothesized words. Since the model relates the entire audio signal to a complete hypothesis without necessarily positing any inherent structure, we term this a flat direct model (FDM). In contrast to a conventional hidden Markov model approach, no Markov assumptions are used, and the model is not necessarily sequential. We demonstrate the use of features based on both template-matching distances and the acoustic detection of multi-phone units that are selected so as to have maximal mutual information with respect to word labels. Further, we solve the key problem of how to define features that can generalize to unseen word sequences. In the proposed model, template-based features improve sentence error rate by 3% absolute over the baseline, while multi-phone-based features improve it by 2% absolute.
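As a rough illustration of the log-linear form the abstract describes, the sketch below scores each complete hypothesis W against the audio X with a weighted sum of whole-utterance features and normalizes, i.e. P(W | X) is proportional to exp(sum_i lambda_i * f_i(W, X)). It is not the authors' implementation: the function name, the feature names, the toy values, and the choice to normalize over a small explicit candidate list are all assumptions made for the example.

```python
import math


def flat_direct_posteriors(hypotheses, features, weights):
    """Posterior over whole hypotheses under a log-linear model (illustrative).

    hypotheses: list of candidate word sequences (tuples of words)
    features:   dict mapping hypothesis -> {feature name: value}
    weights:    dict mapping feature name -> weight (lambda)
    """
    # Score each hypothesis as a weighted sum of its features.
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in features[hyp].items())
        for hyp in hypotheses
    ]
    # Softmax over the candidate list; subtract the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return {hyp: e / z for hyp, e in zip(hypotheses, exps)}


# Hypothetical toy data: two candidate transcriptions and two invented features.
hyps = [("recognize", "speech"), ("wreck", "a", "nice", "beach")]
feats = {
    hyps[0]: {"template_distance": -1.2, "multiphone_detected": 2.0},
    hyps[1]: {"template_distance": -2.5, "multiphone_detected": 1.0},
}
weights = {"template_distance": 1.0, "multiphone_detected": 0.5}

posteriors = flat_direct_posteriors(hyps, feats, weights)
best = max(posteriors, key=posteriors.get)
```

Because every feature is a function of the entire hypothesis and the entire utterance, nothing in this scoring step requires a left-to-right decomposition, which is the sense in which the abstract calls the model "flat".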

Keywords :

acoustic signal detection; hidden Markov models; pattern matching; speech recognition; acoustic detection; flat direct model; hidden Markov model; hypothesized word sequence; log-linear modeling framework; multiphone unit; template-based feature; template-matching distance; Acoustics; Feature extraction; Markov processes; Mutual information; Statistical learning; Direct model; features; log-linear model; maximum mutual information (MMI)

Language :

English

Journal_Title :

Selected Topics in Signal Processing, IEEE Journal of

Publisher :

IEEE

ISSN :

1932-4553

Type :

jour

DOI :

10.1109/JSTSP.2010.2080812

Filename :

5585806

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1335703