DocumentCode :
2449910
Title :
Syllable category based short utterance speaker recognition
Author :
Fatima, Nakhat ; Zheng, Thomas Fang
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2012
fDate :
16-18 July 2012
Firstpage :
436
Lastpage :
441
Abstract :
In Short Utterance Speaker Recognition (SUSR), the role of complete speech units such as syllables in carrying speaker information needs further investigation. This paper presents a novel method that uses syllable categories for SUSR. We define Syllable Categories (SCs) based on the syllable structure of the Chinese language. Syllables in speech are segmented into SCs, which are then used to develop a Universal Background SC Model for each SC. A conventional GMM-UBM system is used for training and testing. The proposed categories give average EERs of 17.79%, 19.35% and 21.65% for test utterance lengths of 3, 2 and 1 seconds, respectively. Experimental results show that in text-dependent SUSR, significant speaker-specific information is present at the syllable level, where prosodic idiosyncrasies can be utilized. This information can be used in SUSR by exploiting similarities in the consonants and vowels of a syllable, so that SCs can be used effectively.
Keywords :
Gaussian processes; natural language processing; speaker recognition; text analysis; Chinese language syllable structure; EER; GMM-UBM system; prosodic idiosyncrasy; speaker-specific information; speech units; syllable category based short utterance speaker recognition; text dependent SUSR; universal background SC model; Feature extraction; Hidden Markov models; Liquids; Speaker recognition; Speech; Speech recognition; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Audio, Language and Image Processing (ICALIP), 2012 International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4673-0173-2
Type :
conf
DOI :
10.1109/ICALIP.2012.6376657
Filename :
6376657