Perceptual long-term harmonic plus noise modeling for speech data compression

Author

Faten Ben Ali;Sonia Djaziri-Larbi

Author_Institution

Université de Tunis El Manar, Ecole Nationale d'Ingénieurs de Tunis, Signals and Systems Lab, BP37, 1002 Le Belvédère, Tunis, Tunisia

fYear

2015

Firstpage

1372

Lastpage

1376

Abstract

The harmonic plus noise model (HNM) is widely used for modeling audio signals. In this paper, we introduce perceptual frequency masking into the 2-band HNM developed by Stylianou et al., applied to speech signals. An auditory model is used to identify inaudible sinusoids, which are then removed from the set of model parameters in order to reduce the data size for speech coding. The proposed perceptual HNM was applied to a large speech database drawn from TIMIT and HINT and achieved a substantial compression of the parameter rate (up to 50% in short-term frames), yielding a significant data-rate reduction for the long-term (LT) HNM. The latter is based on LT trajectory modeling of the short-term (ST) HNM parameters. Objective and subjective quality evaluations show that the perceptual HNM introduces no additional distortion compared to the generic 2-band HNM.

Keywords

"Harmonic analysis","Speech","Databases","Biological system modeling","Masking threshold","Data compression"

Publisher

ieee

Conference_Titel

Signal and Information Processing (GlobalSIP), 2015 IEEE Global Conference on

Type

conf

DOI

10.1109/GlobalSIP.2015.7418423

Filename

7418423

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3754257