DocumentCode :
3413107
Title :
Language detection in audio content analysis
Author :
Mitra, Vikramjit ; Garcia-Romero, Daniel ; Espy-Wilson, Carol Y.
Author_Institution :
Dept. of Electr. & Comput. Eng., Maryland Univ., College Park, MD
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
2109
Lastpage :
2112
Abstract :
Experiments have shown that Language Identification systems for telephonic speech that use shifted delta cepstra as the feature set and Gaussian mixture models as the backend offer performance superior to that of competing techniques. This paper addresses the task of Language Identification for audio signals. The abundance of digital music on the Internet calls for a reliable real-time system for analyzing and properly categorizing it. Previous research has mainly focused on categorizing audio files into appropriate genres; however, genre types vary with language. This paper proposes a systematic audio content analysis strategy that first detects whether an audio file contains vocals and, if so, then detects the language of the song. Given the language of the song, genre detection becomes a closed-set classification problem.
Keywords :
audio signals; signal detection; Gaussian mixture models; audio content analysis; genre detection; language detection; shifted delta cepstra; telephonic speech; Cepstral analysis; Educational institutions; Instruments; Internet; Mel frequency cepstral coefficient; Natural languages; Performance analysis; Search engines; Speech analysis; Support vector machines; Audio Content Analysis; GMM-supervector; Gaussian Mixture Model; Language Identification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518058
Filename :
4518058