Title :
IITKGP-MLILSC speech database for language identification
Author :
Maity, Sudhamay ; Vuppala, Anil Kumar ; Rao, K. Sreenivasa ; Nandi, Dipanjan
Author_Institution :
Sch. of Inf. Technol., Indian Inst. of Technol. Kharagpur, Kharagpur, India
Abstract :
In this paper, we introduce a speech database consisting of 27 Indian languages for analyzing the language-specific information present in speech. In the context of Indian languages, a systematic analysis of various speech features and classification models for automatic language identification has not been performed, because of the lack of a proper speech corpus covering the majority of Indian languages. With this motivation, we have initiated the task of developing a multilingual speech corpus in Indian languages. In this paper, spectral features are explored to investigate the presence of language-specific information. Mel-frequency cepstral coefficients (MFCCs) and linear predictive cepstral coefficients (LPCCs) are used to represent the spectral information. Gaussian mixture models (GMMs) are developed to capture the language-specific information present in the spectral features. The performance of the language identification system is analyzed for speaker-dependent and speaker-independent cases. The recognition performance is observed to be 96% and 45% for the speaker-dependent and speaker-independent environments, respectively.
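The GMM-based identification scheme summarized in the abstract can be illustrated with a minimal sketch: one diagonal-covariance GMM is trained per language, and a test utterance is assigned to the language whose model gives the highest average frame log-likelihood. Everything below is an assumption made for illustration and is not taken from the paper: the function names (train_gmm, identify, toy_frames), the labels lang_a and lang_b, the number of mixture components, and the synthetic 3-dimensional vectors that stand in for real MFCC or LPCC frames.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def frame_loglik(x, gmm):
    """log p(x) under a mixture given as a list of (weight, mean, var)."""
    return logsumexp([math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm])

def train_gmm(frames, n_components=2, n_iter=20, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM to feature frames with plain EM."""
    rng = random.Random(seed)
    n, dim = len(frames), len(frames[0])
    # Initialise: random frames as means, global variance, uniform weights.
    gmean = [sum(f[d] for f in frames) / n for d in range(dim)]
    gvar = [max(sum((f[d] - gmean[d]) ** 2 for f in frames) / n, var_floor)
            for d in range(dim)]
    gmm = [(1.0 / n_components, list(f), list(gvar))
           for f in rng.sample(frames, n_components)]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means and (floored) variances.
        new_gmm = []
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10
            mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                    for d in range(dim)]
            var = [max(sum(r[k] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, var_floor)
                   for d in range(dim)]
            new_gmm.append((max(nk / n, 1e-8), mean, var))
        gmm = new_gmm
    return gmm

def identify(frames, models):
    """Return the best-scoring language and the per-language average scores."""
    scores = {lang: sum(frame_loglik(x, g) for x in frames) / len(frames)
              for lang, g in models.items()}
    return max(scores, key=scores.get), scores

def toy_frames(centers, n, seed):
    """Synthetic stand-in for cepstral frames (NOT real speech features)."""
    rng = random.Random(seed)
    return [[rng.gauss(c, 0.5) for c in rng.choice(centers)] for _ in range(n)]

# Hypothetical languages with well-separated synthetic feature clusters.
centers = {"lang_a": [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)],
           "lang_b": [(8.0, 8.0, 8.0), (10.0, 6.0, 9.0)]}
models = {lang: train_gmm(toy_frames(c, 200, seed=i))
          for i, (lang, c) in enumerate(centers.items())}

# Score unseen "utterances" drawn with different seeds.
pred_a, _ = identify(toy_frames(centers["lang_a"], 50, seed=100), models)
pred_b, _ = identify(toy_frames(centers["lang_b"], 50, seed=101), models)
print(pred_a, pred_b)
```

In a real system the synthetic frames would be replaced by MFCC or LPCC vectors extracted from speech, with one model trained per language in the corpus. The sketch does not attempt to reproduce the reported accuracies or the speaker-dependent versus speaker-independent evaluation protocol.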
Keywords :
Gaussian processes; cepstral analysis; feature extraction; natural language processing; pattern classification; speaker recognition; speech processing; Gaussian mixture models; IITKGP-MLILSC speech database; Indian languages; automatic language identification system; language specific information analysis; linear predictive cepstral coefficients; mel-frequency cepstral coefficients; multilingual speech corpus; speaker dependent environments; speaker recognition performance; spectral feature exploration; speech feature classification models; systematic speech feature analysis; Computational modeling; Databases; Mel frequency cepstral coefficient; Predictive models; Speech; Speech recognition; Gaussian mixture models (GMMs); Indian Language Database; Language Identification; Linear prediction cepstral coefficients (LPCCs); Mel-frequency cepstral coefficients (MFCCs);
Conference_Title :
Communications (NCC), 2012 National Conference on
Conference_Location :
Kharagpur
Print_ISBN :
978-1-4673-0815-1
DOI :
10.1109/NCC.2012.6176831