Title :
A K-phoneme-class based multi-model method for short utterance speaker recognition
Author :
Chenhao Zhang ; Xiaojun Wu ; Zheng, Thomas Fang ; Linlin Wang ; Cong Yin
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Abstract :
In GMM-UBM based text-independent speaker recognition, performance degrades significantly when the test speech is very short. Because text information can help in this setting, a K-phoneme-class scoring based multiple phoneme class speaker model method (K-phoneme-class based multi-model method, abbreviated KPCMMM) is proposed. It consists of a phoneme class speech recognition stage followed by a phoneme class dependent multi-model speaker recognition stage, where K is the number of most likely phoneme classes used in the second stage. Two phoneme class definitions, expert-knowledge based and data-driven, are compared, and performance as a function of K is also studied. Experimental results show that the data-driven phoneme class definition outperforms the expert-knowledge based one, and that an appropriate value of K can lead to much better performance. Compared with the baseline GMM-UBM system, the proposed KPCMMM can achieve a relative equal error rate (EER) reduction of 38.60% for text-independent speaker recognition when the test speech is shorter than 2 seconds.
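The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the K class scores are combined, so everything below is an assumption made for illustration: the posterior-weighted average of log-likelihood ratios, the diagonal-covariance GMMs, and the hypothetical names `diag_gmm_loglik`, `top_k_class_score`, `class_posteriors`, `speaker_models`, and `ubm_models`. It is not the authors' implementation.

```python
import numpy as np


def diag_gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, M)
    log_mix = np.log(weights) + log_comp                               # (T, M)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_mix.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_mix - peak).sum(axis=1))
    return float(per_frame.mean())


def top_k_class_score(frames, class_posteriors, speaker_models, ubm_models, k):
    """Score a test utterance using only the K most likely phoneme classes.

    class_posteriors: one score per phoneme class from the first
    (phoneme class recognition) stage.
    speaker_models / ubm_models: dicts mapping class index to a
    (weights, means, variances) tuple.
    ASSUMPTION: class-wise log-likelihood ratios are combined by a
    posterior-weighted average; the abstract does not state the rule.
    """
    posteriors = np.asarray(class_posteriors, dtype=float)
    top = np.argsort(posteriors)[::-1][:k]          # K most likely classes
    mix = posteriors[top] / posteriors[top].sum()   # renormalised weights
    llrs = [
        diag_gmm_loglik(frames, *speaker_models[c])
        - diag_gmm_loglik(frames, *ubm_models[c])
        for c in top
    ]
    return float(np.dot(mix, llrs)), top.tolist()
```

With K = 1 the sketch reduces to scoring against the single most likely phoneme class; larger K spreads the decision over more class-dependent models, which is the trade-off the abstract reports studying.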
Keywords :
Gaussian processes; speaker recognition; GMM-UBM; K-phoneme-class scoring; data-driven phoneme class definition; multimodel method; phoneme class dependent multimodel speaker recognition stage; phoneme class speaker model method; phoneme class speech recognition stage; relative equal error rate reduction; short utterance speaker recognition; text-independent speaker recognition; Data models; Mathematical model; Speaker recognition; Speech; Speech recognition; Training; Training data
Conference_Title :
Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific
Conference_Location :
Hollywood, CA
Print_ISBN :
978-1-4673-4863-8