DocumentCode :
2109086
Title :
Approaches to Language Identification Using Gaussian Mixture Model and Linear Discriminant Analysis
Author :
Zeng, Xiuhua ; Yang, Jian ; Xu, Dan
Author_Institution :
Sch. of Inf. Sci. & Eng., Yunnan Univ., Kunming
fYear :
2008
fDate :
21-22 Dec. 2008
Firstpage :
1109
Lastpage :
1112
Abstract :
The baseline PRLM system achieves the best performance on NIST language recognition evaluation tasks. However, it requires orthographically or phonetically transcribed utterances, which cannot be easily obtained for Chinese dialects and minority languages, so the PRLM system cannot be applied to these languages. To overcome this limitation, we present a Gaussian mixture model recognizer followed by language-dependent language models (GMM-LM) as an approach to language identification. In this paper, we focus on finding the optimum number of frames to train each GMM parameter and on comparing two back-end processing approaches in the GMM-LM system. The experiments show that the LDA processing approach achieves an average accuracy of 78% on 30 s test data, a 45% relative improvement over the simple approach.
Keywords :
Gaussian processes; natural language processing; Chinese dialects; Gaussian mixture model; NIST language recognition evaluation tasks; back-end processing approach; language identification; language-dependent language model; linear discriminant analysis; minority languages; Feature extraction; Information retrieval; Information science; Information security; Information technology; Linear discriminant analysis; NIST; National security; Natural languages; Testing; GMM-LM; LDA; language identification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Information Technology Application Workshops, 2008. IITAW '08. International Symposium on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3505-0
Type :
conf
DOI :
10.1109/IITA.Workshops.2008.212
Filename :
4732132