Speaker recognition using artificial neural networks based on vowel phonemes

Author

Badran, Ehab F M F ; Selim, Hany

Author_Institution

Dept. of Electr. Eng., Assiut Univ., Egypt

fYear

2000

fDate

2000

Firstpage

796

Abstract

Speaker recognition systems attempt to recognize a speaker by his or her voice by measuring the individual characteristics of that voice. Among transformations of LPC parameters, the adaptive component weighted (ACW) cepstrum has been shown to be less susceptible to channel effects than others. Text-independent and text-dependent speaker recognition systems suitable for verification and identification (open set and closed set) are presented. The system is based on locating the vowel phonemes of the test utterance. Preprocessing is applied to the speech signal. The centers of the vowel phonemes are located and identified as speech events using a three-step vowel phoneme locating process. The steps of the locating process are: (1) average magnitude function calculation; (2) vowel phoneme candidate location; and (3) ripple rejection. For each vowel phoneme (20 ms), 10 ACW cepstrum coefficients are calculated and used as inputs to neural networks, and the outputs are accumulated and averaged. The system hardware requirements are a microphone and a sound card. The system software is written in C++ for Windows. The system was tested with a population of 10 speakers (7 male and 3 female), and the following statistics were obtained: 95.67% for text-dependent verification, 93% for text-dependent identification, 92.2% for text-independent verification, and 88.95% for text-independent identification. These tests were done with utterances of one word containing one vowel phoneme (20 ms used for recognizing the speaker). A vowel phoneme recognition application is also presented. A limited-vocabulary recognition system is developed using the vowel phonemes in the limited vocabulary. The feature vector calculation is the same as in the speaker recognition system; the only difference is in the neural network training and size (97.5% word recognition).
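The three-step locating process named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no frame length, threshold, or rejection rule, so everything below is an assumption made for illustration only: the non-overlapping framing, the relative threshold, the "keep the strongest of nearby peaks" rule, and all function and parameter names (`average_magnitude`, `candidate_peaks`, `reject_ripples`, `locate_vowel_centers`, `frame_len`, `rel_threshold`, `min_gap`) are hypothetical and are not taken from the paper.

```python
def average_magnitude(signal, frame_len):
    """Step 1 (assumed form): mean absolute amplitude of each
    non-overlapping frame of frame_len samples."""
    n_frames = len(signal) // frame_len
    return [
        sum(abs(x) for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def candidate_peaks(amf, threshold):
    """Step 2 (assumed form): frame indices that are local maxima of the
    average magnitude function and lie above the threshold."""
    peaks = []
    for i in range(1, len(amf) - 1):
        if amf[i] > threshold and amf[i] >= amf[i - 1] and amf[i] > amf[i + 1]:
            peaks.append(i)
    return peaks


def reject_ripples(amf, peaks, min_gap):
    """Step 3 (assumed form): when two candidates are closer than min_gap
    frames, treat them as ripples of one vowel and keep the stronger one."""
    kept = []
    for p in peaks:
        if kept and p - kept[-1] < min_gap:
            if amf[p] > amf[kept[-1]]:
                kept[-1] = p
        else:
            kept.append(p)
    return kept


def locate_vowel_centers(signal, frame_len=80, rel_threshold=0.5, min_gap=5):
    """Return sample indices of estimated vowel centers.
    The default values are illustrative, not the paper's settings."""
    amf = average_magnitude(signal, frame_len)
    if not amf:
        return []
    threshold = rel_threshold * max(amf)
    peaks = candidate_peaks(amf, threshold)
    kept = reject_ripples(amf, peaks, min_gap)
    # Report the middle sample of each surviving frame.
    return [p * frame_len + frame_len // 2 for p in kept]
```

Under these assumptions, a 20 ms analysis window centered on each returned sample index would then be the segment passed to feature extraction; how the paper actually positions that window is not stated in the abstract.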

Keywords

adaptive signal processing; cepstral analysis; linear predictive coding; neural nets; speaker recognition; speech coding; ACW cepstrum coefficients; C++ language; LPC parameters; adaptive component weighted cepstrum; artificial neural networks; average magnitude function; closed set; microphone; neural network size; neural network training; neural networks; open set; preprocessing; ripple rejection; sound card; speech signal; statistics; system hardware requirements; system software; test utterance; text-dependent identification; text-dependent speaker recognition systems; text-dependent verification; text-independent identification; text-independent speaker recognition systems; text-independent verification; three-step vowel phoneme; vowel phoneme candidates location; vowel phoneme recognition; Artificial neural networks; Cepstrum; Character recognition; Linear predictive coding; Neural networks; Speaker recognition; Speech processing; Speech recognition; System testing; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on

Conference_Location

Beijing

Print_ISBN

0-7803-5747-7

Type

conf

DOI

10.1109/ICOSP.2000.891631

Filename

891631

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=2727456