Regional Information Center for Science and Technology (RICeST) - Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system

DocumentCode :

940040

Title :

Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system

Author :

Yegnanarayana, B. ; Prasanna, S. R Mahadeva ; Zachariah, Jinu Mariam ; Gupta, Cheedella S.

Author_Institution :

Dept. of Comput. Sci. & Eng., Indian Inst. of Technol. Madras, Chennai, India

Volume :

Issue :

fYear :

2005

fDate :

7/1/2005

Firstpage :

575

Lastpage :

582

Abstract :

This paper proposes a text-dependent (fixed-text) speaker verification system which uses different types of information for making a decision regarding the identity claim of a speaker. The baseline system uses the dynamic time warping (DTW) technique for matching. Detection of the end-points of an utterance is crucial for the performance of the DTW-based template matching. A method based on the vowel onset point (VOP) is proposed for locating the end-points of an utterance. The proposed method for speaker verification uses suprasegmental and source features in addition to spectral features. The suprasegmental features, such as pitch and duration, are extracted using the warping path information in the DTW algorithm. Features of the excitation source, extracted using neural network models, are also used in the text-dependent speaker verification system. Although the suprasegmental and source features individually may not yield good performance, combining the evidence from these features seems to improve the performance of the system significantly. Neural network models are used to combine the evidence from multiple sources of information.
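
The DTW matching step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Euclidean local distances, the basic three-way step pattern with no slope constraints, and hypothetical names (`dtw`, `_as_frames`). It returns both the accumulated cost and the warping path, since the abstract notes that the warping path is what the suprasegmental (pitch and duration) features are derived from.

```python
import numpy as np


def _as_frames(x):
    """Hypothetical helper: view a sequence as (num_frames, num_dims)."""
    x = np.asarray(x, dtype=float)
    return x.reshape(-1, 1) if x.ndim == 1 else x


def dtw(ref, test):
    """Align two feature sequences; return (total_cost, warping_path).

    Assumptions (not taken from the paper): Euclidean frame distance and
    the basic step pattern (diagonal, vertical, horizontal), unweighted.
    """
    ref, test = _as_frames(ref), _as_frames(test)
    n, m = len(ref), len(test)

    # Local distance between every reference frame and every test frame.
    dist = np.linalg.norm(ref[:, None, :] - test[None, :, :], axis=2)

    # Accumulated cost, padded with an infinite border so that every
    # path is forced to start at frame pair (0, 0).
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # Backtrack from the last frame pair to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        k = int(np.argmin((acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])))
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(acc[n, m]), path


if __name__ == "__main__":
    cost, path = dtw([1, 2, 3], [1, 2, 2, 3])
    print(cost, path)
```

In this toy case the second reference frame is paired with two test frames, which is the kind of local stretching that a duration measure could be read from. How the paper actually maps the path to pitch and duration features is not stated in the abstract.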

Keywords :

neural nets; speaker recognition; dynamic time warping technique; fixed-text speaker verification system; neural network models; source feature; spectral features; suprasegmental feature; Computer science; Data mining; Information resources; Information systems; Natural languages; Neural networks; Shape; Speaker recognition; Speech recognition; System testing; Dynamic time warping; source features; speaker verification; spectral features; suprasegmental features; vowel onset point

fLanguage :

English

Journal_Title :

Speech and Audio Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1063-6676

Type :

jour

DOI :

10.1109/TSA.2005.848892

Filename :

1453600

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=940040