Benchmarking human performance for continuous speech recognition

Author

Deshmukh, Neeraj ; Duncan, Richard Jennings ; Ganapathiraju, Aravind ; Picone, Joseph

Author_Institution

Inst. for Signal & Inf. Process., Mississippi State Univ., MS, USA

Volume

4

fYear

1996

fDate

3-6 Oct 1996

Firstpage

2486

Abstract

It is a well-established fact that human performance exceeds that of computers by orders of magnitude on a wide range of speech recognition tasks. However, there is widespread belief that the gap between human and machine performance has narrowed considerably on restricted problems. Yet, there are few extensive comparisons of performance on tasks involving large vocabulary continuous speech recognition (LVCSR) and low signal-to-noise ratios (SNRs). Human evaluations on LVCSR tasks highlight a number of interesting issues. For example, familiarity with the domain plays a crucial role in human performance. The authors conducted several experiments that extensively characterize human performance on LVCSR tasks over two standard evaluation corpora: ARPA's CSR'94 Spoke 10 and CSR'95 Hub 3. They demonstrate that human performance is at least an order of magnitude better than the best machine performance, and that human performance is fairly robust to a number of factors that typically degrade machine performance: SNR, speaking rate and style, microphone and ambient noise. In fact, human performance remained remarkably consistent across evaluation paradigms, and to some extent was artificially limited by a listener's attention span.

Keywords

human factors; speech recognition; ARPA CSR'94 Spoke 10 corpus; ARPA CSR'95 Hub 3 corpus; ambient noise; domain familiarity; human performance benchmarking; large vocabulary continuous speech recognition; listener attention span; low signal-to-noise ratios; machine performance; microphone noise; speaking rate; speaking style; Benchmark testing; Conducting materials; Degradation; Humans; Materials testing; Microphones; Signal processing; Signal to noise ratio; Speech recognition; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on

Conference_Location

Philadelphia, PA

Print_ISBN

0-7803-3555-4

Type

conf

DOI

10.1109/ICSLP.1996.607317

Filename

607317