Voicing Class Dependent Huffman Coding of Compressed Front-End Feature Vector for Distributed Speech Recognition

Author

Kim, Deok Su ; Kim, Hong Kook

Author_Institution

Dept. of Inf. & Commun., Gwangju Inst. of Sci. & Technol., Gwangju

Volume

3

fYear

2008

fDate

13-15 Dec. 2008

Firstpage

51

Lastpage

54

Abstract

In this paper, we propose an entropy coding method that further compresses the quantized mel-frequency cepstral coefficients (MFCCs) extracted for distributed speech recognition (DSR). In the ETSI extended DSR standard, MFCCs are compressed together with additional parameters such as pitch and voicing class. We observe that the distribution of MFCCs varies with the voicing class, which makes it possible to design a separate Huffman tree for each voicing class. Based on this observation, the bit-rate of the compressed MFCCs can be reduced further than with a Huffman coding method that does not consider voicing class. Experiments show that the proposed method achieves a bit-rate of 34.18 bits per frame, which is 1.84 bits/frame lower than that of Huffman coding without voicing class information.
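
The idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the frame data are synthetic, the class labels "voiced" and "unvoiced" and the helper names `huffman_code_lengths` and `average_bits` are hypothetical, and each frame is reduced to a single quantizer index. The sketch also assumes that the voicing class is already available to the decoder as side information, so selecting a table by class costs no extra bits. It builds one Huffman code over all frames and one per class, then compares the average code length.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    if len(freqs) == 1:
        # A single symbol still needs one bit to be transmitted.
        return {next(iter(freqs)): 1}
    # Heap entries are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in d1.items()}
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def average_bits(symbols, lengths):
    """Average number of bits per symbol under the given code lengths."""
    return sum(lengths[s] for s in symbols) / len(symbols)


# Synthetic toy data (an assumption, not taken from the paper):
# each frame is (voicing class, quantizer index), and the index
# distribution is skewed differently in each class.
frames = (
    [("voiced", 0)] * 6 + [("voiced", 1)] + [("voiced", 2)]
    + [("unvoiced", 2)] * 6 + [("unvoiced", 1)] + [("unvoiced", 0)]
)

# Baseline: one Huffman code trained on all frames, ignoring voicing class.
all_indices = [idx for _, idx in frames]
pooled_lengths = huffman_code_lengths(Counter(all_indices))
pooled_rate = average_bits(all_indices, pooled_lengths)

# Class-dependent: one Huffman code per voicing class.
total_bits = 0
for cls in ("voiced", "unvoiced"):
    indices = [idx for c, idx in frames if c == cls]
    lengths = huffman_code_lengths(Counter(indices))
    total_bits += sum(lengths[i] for i in indices)
dependent_rate = total_bits / len(frames)

print(f"pooled: {pooled_rate} bits/frame, class-dependent: {dependent_rate} bits/frame")
```

On this toy data the class-dependent tables give a shorter average code because each class concentrates its probability on a different index. The size of the gain on real DSR features depends on how strongly the MFCC index distributions differ between voicing classes.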

Keywords

Huffman codes; cepstral analysis; data compression; entropy codes; feature extraction; speech coding; speech recognition; trees (mathematics); Huffman tree; compressed front-end feature vector; distributed speech recognition; entropy coding; quantized mel-frequency cepstral coefficient; voicing class dependent Huffman coding; Data mining; Discrete cosine transforms; Huffman coding; Mel frequency cepstral coefficient; Quantization; Telecommunication standards; DSR; MFCC

fLanguage

English

Publisher

IEEE

Conference_Titel

Future Generation Communication and Networking Symposia, 2008. FGCNS '08. Second International Conference on

Conference_Location

Sanya

Print_ISBN

978-1-4244-3430-5

Electronic_ISBN

978-0-7695-3546-3

Type

conf

DOI

10.1109/FGCNS.2008.44

Filename

4813546