Multi-stream processing using context-independent and context-dependent hybrid systems

Author

Hagen, Astrid ; Neto, João P.

Author_Institution

Spoken Language Syst. Lab., INESC, Lisbon, Portugal

Volume

2

fYear

2003

fDate

6-10 April 2003

Abstract

Multi-stream processing is a successful approach to improving the generalization capability of a recognizer and can, moreover, be combined with other robustness techniques, such as spectral subtraction, robust features, and HMM/MLP hybrid systems, among others. The question usually arises of where the different streams should be recombined: at the feature level or at the probability level. Feature combination and probability combination are often seen as alternative approaches. We show here how a sensible combination of both renders this decision obsolete and improves recognition compared with either approach on its own. The study was carried out on the digits and numbers part of the Portuguese SPEECHDAT corpus. This corpus includes a large number of speakers and channel conditions and is thus well suited to testing the described multi-stream systems under realistic conditions. Results are presented for both context-independent and context-dependent models used in an HMM/MLP hybrid system.
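
The distinction between the two recombination points can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the weighted geometric mean used for probability combination, and the idea of treating the classifier trained on concatenated features as one more stream are all assumptions made for illustration, and the posterior values are invented.

```python
import math


def concat_features(stream_a, stream_b):
    """Feature-level combination: join the per-frame vectors of two streams."""
    if len(stream_a) != len(stream_b):
        raise ValueError("streams must have the same number of frames")
    return [list(a) + list(b) for a, b in zip(stream_a, stream_b)]


def combine_posteriors(posteriors, weights=None):
    """Probability-level combination for a single frame.

    Takes one class-posterior vector per stream and returns their weighted
    geometric mean, renormalized to sum to 1. This rule is an assumption;
    other rules (e.g. a weighted sum) are equally plausible.
    """
    n = len(posteriors)
    if weights is None:
        weights = [1.0 / n] * n
    eps = 1e-12  # floor to avoid log(0)
    num_classes = len(posteriors[0])
    log_scores = [
        sum(w * math.log(max(p[k], eps)) for w, p in zip(weights, posteriors))
        for k in range(num_classes)
    ]
    peak = max(log_scores)  # subtract the peak for numerical stability
    unnorm = [math.exp(s - peak) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Two hypothetical feature streams, two frames each.
frames_a = [[0.1, 0.2], [0.3, 0.4]]
frames_b = [[1.0], [2.0]]
joint = concat_features(frames_a, frames_b)

# Invented posteriors for one frame over three classes: one per
# stream-specific classifier, plus one from a classifier assumed to be
# trained on the concatenated features.
p_a = [0.7, 0.2, 0.1]
p_b = [0.5, 0.4, 0.1]
p_joint = [0.6, 0.3, 0.1]
combined = combine_posteriors([p_a, p_b, p_joint])
```

In a hybrid HMM/MLP setting, the per-stream posteriors would come from separate MLPs, and the combined vector would replace the single-stream posteriors during decoding; the sketch leaves both of those steps out.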

Keywords

hidden Markov models; multilayer perceptrons; speech processing; speech recognition; HMM/MLP hybrid system; HMM/MLP hybrid systems; Portuguese SPEECHDAT corpus; channel conditions; context-dependent hybrid systems; context-dependent models; context-independent hybrid systems; context-independent models; digit recognition; hidden Markov model; multi-stream processing; multilayer perceptron; number recognition; probability level; robust features; spectral subtraction; telephone line; Automatic speech recognition; Context modeling; Laboratories; Natural languages; Robustness; Spatial databases; Streaming media; System testing

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-7663-3

Type

conf

DOI

10.1109/ICASSP.2003.1202348

Filename

1202348