Simple and artefact-free spectral modifications for enhancing the intelligibility of casual speech

Author

Koutsogiannaki, Maria; Stylianou, Yannis

Author_Institution

Comput. Sci. Dept., Univ. of Crete, Heraklion, Greece

fYear

2014

fDate

4-9 May 2014

Firstpage

4648

Lastpage

4652

Abstract

In this paper, the problem of modifying casual speech to reach the intelligibility level of clear speech is addressed. Unlike other studies, the modifications applied to casual speech in this work consider both intelligibility and speech quality. To achieve this, the authors focus on human-like modifications inspired by clear speech. An acoustic analysis of clear and casual speech reveals energy differences in specific frequency bands between the two speaking styles. A simple method is then used to boost these frequency regions in casual speech. The proposed method, called mix-filtering, uses a multi-band filtering scheme to isolate the information in these frequency bands and then adds this information to the original signal. The method is compared in terms of intelligibility and quality with unmodified casual speech and with a highly intelligible spectral modification technique, namely Spectral Shaping and Dynamic Range Compression (SSDRC). Two objective measures that are highly correlated with subjective intelligibility scores are used to estimate intelligibility, whereas quality is evaluated with preference listening tests. Results show that the mix-filtering technique increases the intelligibility of casual speech while maintaining its quality. SSDRC, on the other hand, achieves higher intelligibility but significantly degrades the quality of casual speech.
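
The isolate-and-add idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the band edges, the filter design, or the mixing gain, so the band limits and `gain` below are hypothetical, and a simple FFT mask stands in for whatever multi-band filter the paper actually uses. The function name `mix_filter` is likewise chosen only for illustration.

```python
import numpy as np


def mix_filter(signal, fs, bands, gain=1.0):
    """Isolate the given frequency bands and add them back to the signal.

    Assumptions (not from the paper): bands are (low_hz, high_hz) pairs,
    isolation is done with an ideal FFT mask, and `gain` scales the
    isolated content before it is mixed with the original.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)

    # Build one mask covering every requested band.
    mask = np.zeros(freqs.shape, dtype=bool)
    for low, high in bands:
        mask |= (freqs >= low) & (freqs <= high)

    # Keep only the masked bins and return to the time domain.
    band_part = np.fft.irfft(np.where(mask, spectrum, 0.0), n=signal.size)

    # Mix: original signal plus the isolated band content.
    return signal + gain * band_part


# Usage with a synthetic two-tone signal (hypothetical band: 1.5-2.5 kHz).
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 2000 * t)
y = mix_filter(x, fs, bands=[(1500.0, 2500.0)], gain=1.0)
```

In this toy case the 2 kHz tone lies inside the boosted band and the 500 Hz tone does not, so only the former is amplified. A practical version would likely use smoother filters and some form of level normalization, neither of which is specified in the abstract.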

Keywords

filtering theory; speech coding; speech enhancement; SSDRC; acoustic analysis; artefact-free spectral modifications; casual speech intelligibility enhancement; clear speech; mix-filtering technique; multiband filtering scheme; spectral shaping and dynamic range compression; speech quality; Auditory system; Signal to noise ratio; Speech; Speech enhancement; System-on-chip; Casual speech; Clear speech; Intelligibility; Spectral modifications; Speech quality;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854483

Filename

6854483