Review of AMR speech codec- and distributed speech recognition-based speech-enabled services

Author

Kiss, Imre ; Lakaniemi, Ari ; Yang, Cao ; Viikki, Olli

Author_Institution

Audio-Visual Syst. Lab., Nokia Res. Center, Tampere, Finland

fYear

2003

fDate

30 Nov.-3 Dec. 2003

Firstpage

613

Lastpage

618

Abstract

In this paper, we investigate the usefulness of general-purpose speech codecs and dedicated speech recognition codecs for speech-enabled services. Specifically, we focus on 3rd generation WCDMA systems using the adaptive multi-rate (AMR) speech codec, in comparison with the distributed speech recognition (DSR) framework. Speech recognition experiments are carried out with the AMR speech codec in a simulated packet-switched network. The performance of the DSR codec is assumed to be unaffected by transmission errors. Experimental results in British English and Mandarin Chinese indicate that no significant performance difference can be observed between the AMR- and DSR-based recognition systems. The gain from using the dedicated DSR codec is unlikely to provide a perceptible improvement in terms of quality of service for the end-users. In the light of the experimental results achieved, and other implementation and economical issues, it is concluded that the use of dedicated speech recognition codecs, such as DSR, does not offer tangible benefits in real-world systems and services.

Keywords

3G mobile communication; Internet telephony; code division multiple access; packet switching; quality of service; speech codecs; speech recognition; 3G WCDMA systems; AMR speech codec; DSR codec; VoIP transmission; adaptive multi-rate speech codec; distributed speech recognition framework; distributed speech-enabled services; packet-switched network; quality of service; speech recognition codecs; Adaptive systems; Audio-visual systems; Automatic speech recognition; Laboratories; Multiaccess communication; Partial response channels; Speech analysis; Speech codecs; Speech recognition; Target recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on

Print_ISBN

0-7803-7980-2

Type

conf

DOI

10.1109/ASRU.2003.1318510

Filename

1318510