DocumentCode :
1371794
Title :
Estimation of handset nonlinearity with application to speaker recognition
Author :
Quatieri, Thomas F. ; Reynolds, Douglas A. ; O'Leary, Gerald C.
Author_Institution :
Lincoln Lab., MIT, Lexington, MA, USA
Volume :
8
Issue :
5
fYear :
2000
fDate :
9/1/2000
Firstpage :
567
Lastpage :
584
Abstract :
A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model, driven by an undistorted reference. This “magnitude only” representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless nonlinearity sandwiched between two finite-length linear filters. Nonlinearities considered include arbitrary finite-order polynomials and parametric sigmoidal functionals derived from a carbon-button handset model. Minimization of a mean-squared spectral magnitude distance with respect to model parameters relies on iterative estimation via a gradient descent technique. Initial work has demonstrated the importance of addressing handset nonlinearity, in addition to linear distortion, in speaker recognition over telephone channels. A nonlinear handset “mapping,” applied to training or testing data to reduce mismatch between different types of handset microphone outputs, improves speaker verification performance relative to linear compensation only. Finally, a method is proposed to merge the mapper strategy with a method of likelihood score normalization (hnorm) for further mismatch reduction and speaker verification performance improvement.
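The model structure named in the abstract (linear filter, memoryless polynomial, linear filter, fitted by gradient descent on a mean-squared spectral magnitude distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whole-signal FFT, the finite-difference gradient, and the step-halving rule are all assumptions made for illustration, and only the polynomial coefficients are fitted while the two filters are held fixed.

```python
import numpy as np

def handset_model(x, g, coeffs, h):
    """Hypothetical model: pre-filter g -> memoryless polynomial -> post-filter h."""
    v = np.convolve(x, g)                             # first finite-length linear filter
    w = sum(c * v**k for k, c in enumerate(coeffs))   # polynomial, lowest order first
    return np.convolve(w, h)                          # second finite-length linear filter

def magnitude_distance(a, b):
    """Mean-squared difference between spectral magnitudes (phase is ignored)."""
    n = max(len(a), len(b))
    A = np.abs(np.fft.rfft(a, n))
    B = np.abs(np.fft.rfft(b, n))
    return float(np.mean((A - B) ** 2))

def fit_polynomial(x, y_obs, g, h, order=3, step=1e-3, iters=200, eps=1e-6):
    """Gradient descent on the polynomial coefficients only.

    Assumption: the gradient is approximated by finite differences, and the
    step is halved whenever a trial update fails to reduce the distance.
    """
    coeffs = np.zeros(order + 1)
    coeffs[1] = 1.0                                   # start from the identity mapping
    loss = lambda c: magnitude_distance(handset_model(x, g, c, h), y_obs)
    current = loss(coeffs)
    for _ in range(iters):
        grad = np.zeros_like(coeffs)
        for k in range(len(coeffs)):
            bumped = coeffs.copy()
            bumped[k] += eps
            grad[k] = (loss(bumped) - current) / eps
        trial = coeffs - step * grad
        trial_loss = loss(trial)
        if trial_loss < current:
            coeffs, current = trial, trial_loss       # accept the improving step
        else:
            step *= 0.5                               # otherwise shrink the step
    return coeffs
```

Because the distance compares magnitudes only, two signals that differ solely in phase score as identical, which is the property the abstract relies on for recognizers that use spectral magnitude features.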
Keywords :
digital filters; iterative methods; least mean squares methods; polynomials; spectral analysis; speech recognition; telephone sets; arbitrary finite-order polynomials; carbon-button handset model; degradation; distorted signal; distortion model; finite-length linear filters; gradient descent technique; handset nonlinearity; iterative estimation; likelihood score normalization; linear compensation; linear distortion; magnitude only representation; mapper strategy; mean-squared spectral magnitude distance; memoryless nonlinearity; mismatch reduction; nonlinear channel model; nonlinear channels; nonlinearities; parametric sigmoidal functionals; speaker recognition; speaker verification; speaker verification performance; spectral magnitude; spectral magnitude information; speech recognition algorithms; telephone handset nonlinearity; undistorted reference; unwanted speech formants; Degradation; Microphones; Nonlinear distortion; Nonlinear filters; Polynomials; Speaker recognition; Speech recognition; Telephone sets; Telephony; Testing;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.861376
Filename :
861376