• DocumentCode
    2727456
  • Title

    Speaker recognition using artificial neural networks based on vowel phonemes

  • Author

    Badran, Ehab F M F ; Selim, Hany

  • Author_Institution
    Dept. of Electr. Eng., Assiut Univ., Egypt
  • Volume
    2
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    796
  • Abstract
    Speaker recognition systems attempt to recognize a speaker by his/her voice through measurements of the individual characteristics arising in the speaker's voice. Among transformations of LPC parameters, the adaptive component weighted (ACW) cepstrum has been shown to be less susceptible to channel effects than others. Text-independent and text-dependent speaker recognition systems suitable for verification and identification (open set and closed set) are presented. The system is based on locating the vowel phonemes of the test utterance. Preprocessing is applied to the speech signal. The centers of the vowel phonemes are located and identified as speech events using a three-step vowel phoneme locating process. The steps of the locating process are: (1) average magnitude function calculation; (2) vowel phoneme candidate location; and (3) ripple rejection. For each vowel phoneme (20 ms), 10 ACW cepstrum coefficients are calculated and used as inputs to neural networks, and the network outputs are accumulated and averaged. The system hardware requirements are a microphone and a sound card. The system software is written in C++ for Windows. The system was tested with a population of 10 speakers (7 male and 3 female), and the following statistics were recorded: 95.67% for text-dependent verification, 93% for text-dependent identification, 92.2% for text-independent verification and 88.95% for text-independent identification. These tests were done with utterances of one word having one vowel phoneme (20 ms used for recognizing the speaker). A vowel phoneme recognition application is also presented. A limited-vocabulary recognition system is developed using the vowel phonemes in the limited vocabulary. The feature vector calculation is the same as in the speaker recognition system; the only difference is in the neural network training and size (97.5% word recognition).
  • Keywords
    adaptive signal processing; cepstral analysis; linear predictive coding; neural nets; speaker recognition; speech coding; ACW cepstrum coefficients; C++ language; LPC parameters; adaptive component weighted cepstrum; artificial neural networks; average magnitude function; closed set; microphone; neural network size; neural network training; neural networks; open set; preprocessing; ripple rejection; sound card; speech signal; statistics; system hardware requirements; system software; test utterance; text-dependent identification; text-dependent speaker recognition systems; text-dependent verification; text-independent identification; text-independent speaker recognition systems; text-independent verification; three-step vowel phoneme; vowel phoneme candidates location; vowel phoneme recognition; Artificial neural networks; Cepstrum; Character recognition; Linear predictive coding; Neural networks; Speaker recognition; Speech processing; Speech recognition; System testing; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Proceedings, 2000. WCCC-ICSP 2000. 5th International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    0-7803-5747-7
  • Type

    conf

  • DOI
    10.1109/ICOSP.2000.891631
  • Filename
    891631
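
  • Illustration

    The abstract names a three-step vowel phoneme locating process (average magnitude function calculation, candidate location, ripple rejection) but gives no further detail. The Python sketch below is one plausible reading of those steps, not the authors' implementation: the frame length, the relative threshold, the local-maximum rule for candidates, the minimum-gap rule for ripple rejection, and all function names are assumptions introduced for illustration only.

```python
def average_magnitude(signal, frame_len):
    """Step 1: mean absolute amplitude of each non-overlapping frame."""
    n_frames = len(signal) // frame_len
    return [
        sum(abs(x) for x in signal[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def locate_candidates(amf, threshold):
    """Step 2 (assumed rule): frames that are local maxima above a threshold."""
    peaks = []
    for i in range(1, len(amf) - 1):
        if amf[i] >= threshold and amf[i] > amf[i - 1] and amf[i] >= amf[i + 1]:
            peaks.append(i)
    return peaks


def reject_ripples(peaks, amf, min_gap):
    """Step 3 (assumed rule): among peaks closer than min_gap frames,
    keep only the strongest one."""
    kept = []
    for p in peaks:
        if kept and p - kept[-1] < min_gap:
            if amf[p] > amf[kept[-1]]:
                kept[-1] = p  # the later peak is stronger, so it replaces the earlier one
        else:
            kept.append(p)
    return kept


def locate_vowel_centers(signal, frame_len=80, rel_threshold=0.5, min_gap=5):
    """Return sample indices of estimated vowel centers.

    All default parameter values are hypothetical, not taken from the paper.
    """
    amf = average_magnitude(signal, frame_len)
    if not amf:
        return []
    threshold = rel_threshold * max(amf)
    peaks = reject_ripples(locate_candidates(amf, threshold), amf, min_gap)
    # Report the middle sample of each surviving frame.
    return [p * frame_len + frame_len // 2 for p in peaks]
```

    In the system described above, a 20 ms segment around each returned center would then be used to compute the 10 ACW cepstrum coefficients fed to the neural networks; that stage is not sketched here.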