DocumentCode :
395206
Title :
Low bit-rate feature vector compression using transform coding and non-uniform bit allocation
Author :
Milner, Ben ; Shao, Xu
Author_Institution :
Sch. of Inf. Syst., East Anglia Univ., Norwich, UK
Volume :
2
fYear :
2003
fDate :
6-10 April 2003
Abstract :
The paper presents a novel method for the low bit-rate compression of a feature vector stream with particular application to distributed speech recognition. The scheme operates by grouping feature vectors into non-overlapping blocks and applying a transformation to give a more compact matrix representation. Both Karhunen-Loeve and discrete cosine transforms are considered. Following transformation, higher-order columns of the matrix can be removed without loss in recognition performance. The number of bits allocated to the remaining elements in the matrix is determined automatically using a measure of their relative information content. Analysis of the amplitude distribution of the elements indicates that non-linear quantisation is more appropriate than linear quantisation. Comparative results, based on both spectral distortion and speech recognition accuracy, confirm this. Speech recognition tests using the ETSI Aurora database demonstrate that compression to bit rates of 2400 bps, 1200 bps and 800 bps has very little effect on recognition accuracy. For example, at a bit rate of 1200 bps, recognition accuracy is 98.0% compared to 98.6% with no compression.
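The block-transform and bit-allocation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_ii`, `transform_block`, `allocate_bits`), the matrix layout (one row per feature, one column per temporal DCT order) and the greedy variance-based allocation rule are all assumptions chosen for illustration. In particular, the greedy rule is a standard transform-coding stand-in for the paper's "measure of relative information content", which the abstract does not specify.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a 1-D sequence (pure Python, O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def transform_block(block, keep):
    """Transform one non-overlapping block of feature vectors.

    block: list of N feature vectors, each of length D (assumed layout).
    Returns a D x keep matrix: row d holds the first `keep` temporal DCT
    coefficients of feature d; the higher-order columns are dropped.
    """
    dims = len(block[0])
    return [dct_ii([frame[d] for frame in block])[:keep] for d in range(dims)]


def allocate_bits(variances, total_bits):
    """Greedy non-uniform bit allocation (illustrative stand-in).

    Each bit goes to the element with the largest remaining variance;
    one extra bit is assumed to reduce that element's error variance
    by a factor of four (about 6 dB per bit).
    """
    remaining = list(variances)
    bits = [0] * len(variances)
    for _ in range(total_bits):
        i = max(range(len(remaining)), key=remaining.__getitem__)
        bits[i] += 1
        remaining[i] /= 4.0
    return bits
```

As a usage illustration, assuming a hypothetical rate of 100 feature vectors per second and blocks of 8 vectors, a 1200 bps budget gives 1200 × 8 / 100 = 96 bits per block, which would be passed as `total_bits` together with per-element variances estimated from training data.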
Keywords :
Karhunen-Loeve transforms; data compression; discrete cosine transforms; distortion; matrix algebra; quantisation (signal); speech coding; speech recognition; transform coding; 800 to 2400 bit/s; ETSI Aurora database; distributed speech recognition; feature vector compression; linear quantisation; low bit-rate compression; matrix; matrix representation; nonlinear quantisation; nonuniform bit allocation; spectral distortion; Bit rate; Distortion measurement; Nonlinear distortion; Performance loss; Quantization; Telecommunication standards; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1202311
Filename :
1202311