Title :
Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system
Author :
Yegnanarayana, B. ; Prasanna, S. R. Mahadeva ; Zachariah, Jinu Mariam ; Gupta, Cheedella S.
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Madras, Chennai, India
Date :
7/1/2005
Abstract :
This paper proposes a text-dependent (fixed-text) speaker verification system that uses several types of information to decide on the identity claim of a speaker. The baseline system uses the dynamic time warping (DTW) technique for matching. Because DTW-based template matching depends on where each utterance begins and ends, accurate detection of the end-points of an utterance is crucial to its performance. A method based on the vowel onset point (VOP) is proposed for locating these end-points. In addition to spectral features, the proposed speaker verification method uses suprasegmental and source features. Suprasegmental features such as pitch and duration are extracted using the warping path information from the DTW algorithm. Features of the excitation source, extracted using neural network models, are also used in the text-dependent speaker verification system. Although the suprasegmental and source features may not yield good performance individually, combining the evidence from these features seems to improve the performance of the system significantly. Neural network models are used to combine the evidence from the multiple sources of information.
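The DTW matching step mentioned in the abstract can be illustrated with a minimal sketch. This is a generic textbook DTW, not the authors' implementation: the function name `dtw`, the Euclidean local distance, the unconstrained three-way step pattern, and the toy one-dimensional "feature vectors" are all assumptions made for illustration. The sketch returns both the accumulated matching cost and the warping path, since the abstract states that the warping path is what the suprasegmental features (pitch, duration) are extracted from.

```python
import math


def dtw(ref, test):
    """Align two sequences of feature vectors with basic DTW.

    Returns (total_cost, path), where path is a list of (ref_index,
    test_index) pairs from the start to the end of both sequences.
    Assumes both sequences are non-empty and already end-pointed.
    """
    n, m = len(ref), len(test)
    # cost[i][j] = best accumulated cost aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(ref[i - 1], test[j - 1])  # Euclidean distance
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # diagonal step
                cost[i - 1][j],      # advance in ref only
                cost[i][j - 1],      # advance in test only
            )

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example: the test sequence repeats the middle frame of the reference.
reference = [[0.0], [1.0], [2.0]]
utterance = [[0.0], [1.0], [1.0], [2.0]]
total_cost, warping_path = dtw(reference, utterance)
print(total_cost)
print(warping_path)
```

In a verification setting, the accumulated cost would typically be compared against a threshold, and a reference frame that maps to several test frames in the path (as the middle frame does in the toy example) indicates a local difference in duration. How the paper normalizes the cost or derives its duration and pitch features from the path is not stated in the abstract.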
Keywords :
neural nets; speaker recognition; dynamic time warping technique; fixed-text speaker verification system; neural network models; source feature; spectral features; suprasegmental feature; Computer science; Data mining; Information resources; Information systems; Natural languages; Neural networks; Shape; Speech recognition; System testing; Dynamic time warping; source features; speaker verification; suprasegmental features; vowel onset point
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2005.848892