Regional Information Center for Science and Technology - Estimation of handset nonlinearity with application to speaker recognition

DocumentCode :

1371794

Title :

Estimation of handset nonlinearity with application to speaker recognition

Author :

Quatieri, Thomas F. ; Reynolds, Douglas A. ; O'Leary, Gerald C.

Author_Institution :

Lincoln Lab., MIT, Lexington, MA, USA

Volume :

Issue :

Year :

2000

Date :

9/1/2000

Firstpage :

567

Lastpage :

584

Abstract :

A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model, driven by an undistorted reference. This “magnitude only” representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless nonlinearity sandwiched between two finite-length linear filters. Nonlinearities considered include arbitrary finite-order polynomials and parametric sigmoidal functionals derived from a carbon-button handset model. Minimization of a mean-squared spectral magnitude distance with respect to model parameters relies on iterative estimation via a gradient descent technique. Initial work has demonstrated the importance of addressing handset nonlinearity, in addition to linear distortion, in speaker recognition over telephone channels. A nonlinear handset “mapping,” applied to training or testing data to reduce mismatch between different types of handset microphone outputs, improves speaker verification performance relative to linear compensation only. Finally, a method is proposed to merge the mapper strategy with a method of likelihood score normalization (hnorm) for further mismatch reduction and speaker verification performance improvement.
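
The distortion model and distance named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the Hann window, the frame length and hop, and the polynomial without a constant term are all assumptions made for illustration. The sketch shows only the forward model (linear filter, memoryless polynomial, linear filter) and the mean-squared spectral magnitude distance; the gradient descent over model parameters is not shown.

```python
import numpy as np

def handset_model(x, pre_filter, poly_coeffs, post_filter):
    """Hypothetical forward model: FIR filter -> memoryless polynomial -> FIR filter."""
    v = np.convolve(x, pre_filter, mode="full")[: len(x)]
    # Assumption: poly_coeffs[k] multiplies v**(k+1), so there is no constant term.
    w = sum(c * v ** (k + 1) for k, c in enumerate(poly_coeffs))
    return np.convolve(w, post_filter, mode="full")[: len(x)]

def frame_magnitudes(signal, frame_len=256, hop=128):
    """Short-time spectral magnitudes (Hann window; frame sizes are assumptions)."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s : s + frame_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))

def spectral_magnitude_distance(measured, modeled, frame_len=256, hop=128):
    """Mean-squared difference between the two sets of spectral magnitudes."""
    diff = frame_magnitudes(measured, frame_len, hop) - frame_magnitudes(modeled, frame_len, hop)
    return float(np.mean(diff ** 2))

# Synthetic check: noise stands in for the undistorted reference signal.
rng = np.random.default_rng(0)
reference = rng.standard_normal(4096)
pre, poly, post = np.array([1.0, 0.3]), [1.0, 0.0, -0.2], np.array([1.0, -0.5])
distorted = handset_model(reference, pre, poly, post)

# A model with the generating parameters versus a linear-only model.
d_match = spectral_magnitude_distance(distorted, handset_model(reference, pre, poly, post))
d_linear = spectral_magnitude_distance(distorted, handset_model(reference, pre, [1.0], post))
```

In this synthetic setup the distance is zero for the parameters that generated the distorted signal and positive for the linear-only model, which is the quantity an iterative estimator would drive down with respect to the filter taps and polynomial coefficients.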

Keywords :

digital filters; iterative methods; least mean squares methods; polynomials; spectral analysis; speech recognition; telephone sets; arbitrary finite-order polynomials; carbon-button handset model; degradation; distorted signal; distortion model; finite-length linear filters; gradient descent technique; handset nonlinearity; iterative estimation; likelihood score normalization; linear compensation; linear distortion; magnitude only representation; mapper strategy; mean-squared spectral magnitude distance; memoryless nonlinearity; mismatch reduction; nonlinear channel model; nonlinear channels; nonlinearities; parametric sigmoidal functionals; speaker recognition; speaker verification; speaker verification performance; spectral magnitude; spectral magnitude information; speech recognition algorithms; telephone handset nonlinearity; undistorted reference; unwanted speech formants; Degradation; Microphones; Nonlinear distortion; Nonlinear filters; Polynomials; Speaker recognition; Speech recognition; Telephone sets; Telephony; Testing

Language :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/89.861376

Filename :

861376

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1371794