Significance of Vowel-Like Regions for Speaker Verification Under Degraded Conditions

Author

Prasanna, S.R.M. ; Pradhan, G.

Author_Institution

Dept. of Electron. & Electr. Eng., Indian Inst. of Technol. Guwahati, Guwahati, India

Volume

19

Issue

8

fYear

2011

Firstpage

2552

Lastpage

2565

Abstract

Vowel-like regions (VLRs) in speech include vowel, semivowel, and diphthong sound units. A VLR can be identified using the vowel-like region onset point (VLROP) event. Because a VLR is produced with impulse-like excitation, information about the vocal tract system may be better manifested in it. A VLR is also a relatively high signal-to-noise ratio (SNR) region. Speaker information extracted from such a region may therefore be more speaker-discriminative and less affected by degradations such as noise, reverberation, and sensor mismatch. Consequently, better speaker modeling and more reliable testing may be possible. In this paper, VLRs are detected using knowledge of VLROPs during both training and testing, and features from the VLRs are then used to train and test the speaker models. A significant improvement in speaker verification performance under degraded conditions is reported.

Keywords

reverberation; speaker recognition; VLR; degradation like noise; diphthong sound unit; impulse-like excitation; sensor mismatch; signal-to-noise ratio; speaker information; speaker model testing; speaker model training; speaker verification; vocal tract system; vowel-like region onset point; Degradation; Noise; Testing; Training; Degraded condition; speaker verification (SV); vowel-like region (VLR)

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

ieee

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2011.2155061

Filename

5767548