DocumentCode
3754257
Title
Perceptual long-term harmonic plus noise modeling for speech data compression
Author
Faten Ben Ali;Sonia Djaziri-Larbi
Author_Institution
Université de Tunis El Manar, Ecole Nationale d'Ingénieurs de Tunis, Signals and Systems Lab, BP37, 1002 Le Belvédère, Tunis, Tunisia
fYear
2015
Firstpage
1372
Lastpage
1376
Abstract
The harmonic plus noise model (HNM) is widely used for modeling audio signals. In this paper, we introduce perceptual frequency masking into the 2-band HNM developed by Stylianou et al. and apply it to speech signals. An auditory model is used to identify inaudible sinusoids, which are then removed from the set of model parameters to reduce the data size for speech coding. The proposed perceptual HNM was applied to a large speech database drawn from TIMIT and HINT and achieved a substantial compression of the parameter rate (up to 50% in short-term frames), yielding a significant data-rate reduction for the long-term (LT) HNM. The latter is based on LT trajectory modeling of the short-term (ST) HNM parameters. Objective and subjective quality evaluations show that the perceptual HNM introduces no additional distortion compared to the generic 2-band HNM.
Keywords
"Harmonic analysis","Speech","Databases","Biological system modeling","Masking threshold","Data compression"
Publisher
IEEE
Conference_Titel
Signal and Information Processing (GlobalSIP), 2015 IEEE Global Conference on
Type
conf
DOI
10.1109/GlobalSIP.2015.7418423
Filename
7418423