Regional Information Center for Science and Technology - Joint encoding of the waveform and speech recognition features using a transform codec

DocumentCode :

2178006

Title :

Joint encoding of the waveform and speech recognition features using a transform codec

Author :

Fan, Xing ; Seltzer, Michael L. ; Droppo, Jasha ; Malvar, Henrique S. ; Acero, Alex

Author_Institution :

Microsoft Res., Redmond, WA, USA

fYear :

2011

fDate :

22-27 May 2011

Firstpage :

5148

Lastpage :

5151

Abstract :

We propose a new transform speech codec that jointly encodes a wideband waveform and its corresponding wideband and narrowband speech recognition features. For distributed speech recognition, wideband features are compressed and transmitted as side information. The waveform is then encoded in a manner that exploits the information already captured by the speech features. Narrowband speech acoustic features can be synthesized at the server by applying a transformation to the decoded wideband features. An evaluation conducted on an in-car speech recognition task shows that at 16 kbps our new system typically has essentially no impact on word error rate compared to uncompressed audio, whereas the standard transform codec produces up to a 20% increase in word error rate. In addition, good quality speech is obtained for playback and transcription, with PESQ scores ranging from 3.2 to 3.4.

Keywords :

speech codecs; speech recognition; speech synthesis; PESQ; in-car speech recognition; joint encoding; narrowband speech recognition features; speech acoustic features task; speech codec; transform codec; wideband speech recognition features; Codecs; Encoding; Mel frequency cepstral coefficient; Narrowband; Speech; Speech recognition; Wideband; Siren codec; distributed speech recognition; speech coding; transform coding;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location :

Prague

ISSN :

1520-6149

Print_ISBN :

978-1-4577-0538-0

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2011.5947516

Filename :

5947516

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2178006