Title :
Estimation of handset nonlinearity with application to speaker recognition
Author :
Quatieri, Thomas F. ; Reynolds, Douglas A. ; O'Leary, Gerald C.
Author_Institution :
Lincoln Lab., MIT, Lexington, MA, USA
fDate :
9/1/2000 12:00:00 AM
Abstract :
A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model, driven by an undistorted reference. This “magnitude only” representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless nonlinearity sandwiched between two finite-length linear filters. Nonlinearities considered include arbitrary finite-order polynomials and parametric sigmoidal functionals derived from a carbon-button handset model. Minimization of a mean-squared spectral magnitude distance with respect to model parameters relies on iterative estimation via a gradient descent technique. Initial work has demonstrated the importance of addressing handset nonlinearity, in addition to linear distortion, in speaker recognition over telephone channels. A nonlinear handset “mapping,” applied to training or testing data to reduce mismatch between different types of handset microphone outputs, improves speaker verification performance relative to linear compensation only. Finally, a method is proposed to merge the mapper strategy with a method of likelihood score normalization (hnorm) for further mismatch reduction and speaker verification performance improvement.
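The structure named in the abstract (a memoryless nonlinearity between two finite-length linear filters, scored by a mean-squared spectral magnitude distance) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `handset_model` and `spectral_magnitude_distance`, the polynomial form of the nonlinearity, the Hann window, and the frame and hop lengths. The gradient descent estimation of the parameters is not shown.

```python
import numpy as np

def handset_model(x, pre_filter, poly_coeffs, post_filter):
    """Hypothetical channel model: FIR filter -> memoryless polynomial -> FIR filter."""
    # First finite-length linear filter, truncated to the input length.
    u = np.convolve(x, pre_filter, mode="full")[:len(x)]
    # Memoryless polynomial v[n] = sum_k c_k * u[n]**k, coefficients in ascending order.
    v = np.polynomial.polynomial.polyval(u, poly_coeffs)
    # Second finite-length linear filter, truncated to the input length.
    return np.convolve(v, post_filter, mode="full")[:len(x)]

def spectral_magnitude_distance(y_model, y_observed, frame_len=256, hop=128):
    """Mean-squared difference between short-time spectral magnitudes (phase is ignored)."""
    window = np.hanning(frame_len)  # assumed analysis window
    total, count = 0.0, 0
    for start in range(0, len(y_model) - frame_len + 1, hop):
        m = np.abs(np.fft.rfft(window * y_model[start:start + frame_len]))
        o = np.abs(np.fft.rfft(window * y_observed[start:start + frame_len]))
        total += np.sum((m - o) ** 2)
        count += m.size
    return total / count
```

In this sketch, an iterative estimator would adjust `pre_filter`, `poly_coeffs`, and `post_filter` to reduce `spectral_magnitude_distance(handset_model(reference, ...), distorted)`. Because only magnitudes are compared, a signal and its sign-inverted copy are at zero distance, which is one consequence of a magnitude-only criterion.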
Keywords :
digital filters; iterative methods; least mean squares methods; polynomials; spectral analysis; speech recognition; telephone sets; arbitrary finite-order polynomials; carbon-button handset model; degradation; distorted signal; distortion model; finite-length linear filters; gradient descent technique; handset nonlinearity; iterative estimation; likelihood score normalization; linear compensation; linear distortion; magnitude only representation; mapper strategy; mean-squared spectral magnitude distance; memoryless nonlinearity; mismatch reduction; nonlinear channel model; nonlinear channels; nonlinearities; parametric sigmoidal functionals; speaker recognition; speaker verification; speaker verification performance; spectral magnitude; spectral magnitude information; speech recognition algorithms; telephone handset nonlinearity; undistorted reference; unwanted speech formants; Degradation; Microphones; Nonlinear distortion; Nonlinear filters; Polynomials; Speaker recognition; Speech recognition; Telephone sets; Telephony; Testing;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on