Title :
Speech Enhancement for Listeners With Hearing Loss Based on a Model for Vowel Coding in the Auditory Midbrain
Author :
Rao, Akhila ; Carney, Laurel H.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Rochester, Rochester, NY, USA
Abstract :
A novel signal-processing strategy is proposed to enhance speech for listeners with hearing loss. The strategy focuses on improving vowel perception based on a recent hypothesis for vowel coding in the auditory system. Traditionally, studies of neural vowel encoding have focused on the representation of formants (peaks in vowel spectra) in the discharge patterns of the population of auditory-nerve (AN) fibers. A recent hypothesis focuses instead on vowel encoding in the auditory midbrain, and suggests a robust representation of formants. AN fiber discharge rates are characterized by pitch-related fluctuations having frequency-dependent modulation depths. Fibers tuned to frequencies near formants exhibit weaker pitch-related fluctuations than those tuned to frequencies between formants. Many auditory midbrain neurons show tuning to amplitude modulation frequency in addition to audio frequency. According to the auditory midbrain vowel encoding hypothesis, the response map of a population of midbrain neurons tuned to modulations near voice pitch exhibits minima near formant frequencies, due to the lack of strong pitch-related fluctuations at their inputs. This representation is robust over the range of noise conditions in which speech intelligibility is also robust for normal-hearing listeners. Based on this hypothesis, a vowel-enhancement strategy has been proposed that aims to restore vowel encoding at the level of the auditory midbrain. The signal processing consists of pitch tracking, formant tracking, and formant enhancement. The novel formant-tracking method proposed here estimates the first two formant frequencies by modeling characteristics of the auditory periphery, such as saturated discharge rates of AN fibers and modulation tuning properties of auditory midbrain neurons. 
The formant-enhancement stage aims to restore the representation of formants at the level of the midbrain by increasing the dominance of a single harmonic near each formant and saturating that frequency channel. A MATLAB implementation of the system with low computational complexity was developed. Objective tests of the formant-tracking subsystem on vowels suggest that the method generalizes well over a wide range of speakers and vowels.
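The idea behind the enhancement stage can be illustrated with a minimal sketch in Python. It is not the authors' MATLAB implementation and it omits pitch tracking, formant tracking, and saturation entirely. It assumes a vowel made of zero-phase harmonics of the voice pitch, a Gaussian weighting as a crude stand-in for one peripheral frequency channel, and an arbitrary boost factor; the function names, the gain of 8, and the 100 Hz bandwidth are all hypothetical choices made for illustration. The fluctuation index is the ratio of the pitch-rate beat term to the steady term in the squared envelope of the channel output, which shrinks when a single harmonic dominates the channel.

```python
import math


def nearest_harmonic(formant_hz, f0_hz):
    """1-based index of the harmonic of f0_hz closest to formant_hz."""
    return max(1, round(formant_hz / f0_hz))


def enhance_harmonics(amplitudes, f0_hz, formants_hz, gain=8.0):
    """Copy of the harmonic amplitudes with the harmonic nearest each
    formant scaled by `gain` (hypothetical value, for illustration only)."""
    out = list(amplitudes)
    for formant in formants_hz:
        k = nearest_harmonic(formant, f0_hz)
        if k <= len(out):
            out[k - 1] *= gain
    return out


def f0_fluctuation(amplitudes, f0_hz, cf_hz, bw_hz):
    """Pitch-rate fluctuation index for one channel centered at cf_hz.

    Assumes zero-phase harmonics and a Gaussian channel weighting.
    For weighted amplitudes A_k, the squared envelope has a steady term
    sum(A_k^2) and a beat at f0 with amplitude 2 * sum(A_k * A_(k+1));
    the index is the ratio of the two.
    """
    weighted = [
        a * math.exp(-0.5 * ((k * f0_hz - cf_hz) / bw_hz) ** 2)
        for k, a in enumerate(amplitudes, start=1)
    ]
    steady = sum(w * w for w in weighted)
    beat = 2.0 * sum(x * y for x, y in zip(weighted, weighted[1:]))
    return beat / steady


# Hypothetical example: a flat 30-harmonic spectrum at a 100 Hz pitch,
# as a stand-in for a vowel whose spectral peaks have been lost.
f0 = 100.0
flat = [1.0] * 30
enhanced = enhance_harmonics(flat, f0, formants_hz=[500.0, 1500.0])

before = f0_fluctuation(flat, f0, cf_hz=500.0, bw_hz=100.0)
after = f0_fluctuation(enhanced, f0, cf_hz=500.0, bw_hz=100.0)
print("fluctuation index near first formant:", before, "->", after)
```

With a sufficiently large boost, the index in the channel near each formant drops, which corresponds to the weaker pitch-related fluctuations that the hypothesis associates with formant frequencies. A small boost does not guarantee this, so the gain here was chosen large on purpose.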
Keywords :
brain models; computational complexity; fluctuations; hearing; mathematics computing; medical disorders; medical signal processing; natural fibres; neural nets; neurophysiology; noise; signal restoration; speech coding; speech enhancement; speech intelligibility; tracking; tuning; AN fiber discharge rate characterization; AN fiber population discharge patterns; MATLAB implementation; amplitude modulation frequency; audio frequency; auditory midbrain neuron modulation tuning properties; auditory midbrain neuron tuning; auditory midbrain vowel encoding hypothesis; auditory midbrain vowel encoding restoration; auditory periphery characteristic modeling; auditory system; auditory-nerve fiber population; between formant frequencies; computational complexity; fiber tuning; formant enhancement; formant frequency estimation; formant representation restoration; formant tracking; formant-tracking subsystem; frequency channel saturation; frequency-dependent modulation depths; hearing loss listeners; midbrain neuron population response map; near formant frequencies; near voice pitch modulations; neural vowel encoding; noise condition range; normal-hearing listeners; pitch tracking; pitch-related fluctuations; robust speech intelligibility; saturated AN fiber discharge rates; signal processing strategy; single harmonic dominance; speech enhancement; vowel coding model; vowel perception improvement; vowel spectral peak representation; vowel-enhancement strategy; Auditory system; Encoding; Fluctuations; Frequency modulation; Neurons; Speech; Auditory models; formant detection; formant estimation; formant tracking; hearing aids; neural coding; speech analysis;
Journal_Title :
Biomedical Engineering, IEEE Transactions on
DOI :
10.1109/TBME.2014.2313618