• DocumentCode
    395227
  • Title
    Multi-stream processing using context-independent and context-dependent hybrid systems
  • Author
    Hagen, Astrid; Neto, João P.
  • Author_Institution
    Spoken Language Syst. Lab., INESC, Lisbon, Portugal
  • Volume
    2
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    Multi-stream processing is a successful approach to enhancing the generalization capability of a recognizer and can, moreover, be combined with other robustness techniques, such as spectral subtraction, robust features, and HMM/MLP hybrid systems. The question usually arises at which point the different streams should be recombined, i.e., at the feature level or at the probability level. Feature and probability combination are often seen as alternative approaches. We show here how a sensible combination of both renders this decision obsolete and improves recognition compared with either approach carried out on its own. The study was carried out on the digits and numbers part of the Portuguese SPEECHDAT corpus. This corpus includes a large number of speakers and channel conditions and is thus well suited to testing the described multi-stream systems under realistic conditions. Results are presented for both context-independent and context-dependent models used in an HMM/MLP hybrid system.
  • Keywords
    hidden Markov models; multilayer perceptrons; speech processing; speech recognition; HMM/MLP hybrid system; HMM/MLP hybrid systems; Portuguese SPEECHDAT corpus; channel conditions; context-dependent hybrid systems; context-dependent models; context-independent hybrid systems; context-independent models; digit recognition; hidden Markov model; multi-stream processing; multilayer perceptron; number recognition; probability level; robust features; spectral subtraction; telephone line; Automatic speech recognition; Context modeling; Hidden Markov models; Laboratories; Natural languages; Robustness; Spatial databases; Speech recognition; Streaming media; System testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1202348
  • Filename
    1202348