How to make more efficient use of the fact that the speech signal is dynamic and redundant

Author

Pols, Louis C W ; Plomp, Reinier

Author_Institution

University of Amsterdam, Soesterberg, The Netherlands

Volume

11

fYear

1986

fDate

April 1986

Firstpage

1963

Lastpage

1966

Abstract

In contrast to human speech perception, most speech-analysis and speech-processing techniques for front-end automatic speech recognition are based on discrete spectral analysis that takes no account of temporal continuity, dynamic aspects, or context dependency. Speech recognition is at present based primarily on one specific signal aspect at a time, whereas it should make more efficient use of parallel information channels that handle multiple cues. The actual global and local conditions should determine which single parameter or combined set of parameters is important at a given moment. Dynamic information deserves more attention. The human ear actually seems to be 'overpowered' for processing good-quality speech; only under critical conditions do the full analyzing capability of the ear and all the redundant information in the speech signal have to be used. Automatic speech recognition systems do not yet have this 'overcapacity' and should therefore rely more on multiple, highly resistant cues.

Keywords

Automatic speech recognition; Ear; Humans; Information analysis; Signal analysis; Signal processing; Spectral analysis; Speech analysis; Speech processing; Speech recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86.

Type

conf

DOI

10.1109/ICASSP.1986.1168647

Filename

1168647