Neural Network Ensemble Based on Vowel Classification for Chinese Speaker Recognition

Author

Qian, Bo ; Tang, Zhen-min ; Li, Yan-Ping ; Xu, Li-Min ; Zhang, Yan

Author_Institution

Nanjing Univ., Nanjing

Volume

3

fYear

2007

Firstpage

141

Lastpage

145

Abstract

It is well known that the features of a speech signal reflect not only the speaker's identity but also semantic information. In this paper, we describe a novel neural network ensemble architecture based on the finding that, from the standpoint of short-term analysis, diphthongs and multi-vowels in Chinese can be approximately regarded as combinations of mono-vowels and transitional parts. Several neural networks are trained, one for the eigenspace of each mono-vowel, and their results are combined by a further combining neural network. The architecture can improve recognition accuracy by eliminating the disturbance caused by semantic information. Experimental results show that the proposed approach achieves higher recognition accuracy than conventional methods such as a single neural network and other proposed ensemble structures.

Keywords

natural languages; neural nets; speaker recognition; Chinese speaker recognition; mono-vowel eigenspace; neural network ensemble; semantical information; semantical information disturbance; speech signal features; vowel classification; Computer architecture; Computer networks; Detection algorithms; Face recognition; Feature extraction; Mel frequency cepstral coefficient; Neural networks; Speaker recognition; Speech analysis; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Natural Computation, 2007. ICNC 2007. Third International Conference on

Conference_Location

Haikou

Print_ISBN

978-0-7695-2875-5

Type

conf

DOI

10.1109/ICNC.2007.495

Filename

4344494