DocumentCode
3413107
Title
Language detection in audio content analysis
Author
Mitra, Vikramjit ; Garcia-Romero, Daniel ; Espy-Wilson, Carol Y.
Author_Institution
Dept. of Electr. & Comput. Eng., Maryland Univ., College Park, MD
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
2109
Lastpage
2112
Abstract
Experiments have shown that Language Identification systems for telephonic speech that use shifted delta cepstra as the feature set and Gaussian mixture models as the backend outperform competing techniques. This paper addresses the task of Language Identification for audio signals. The abundance of digital music on the Internet calls for a reliable real-time system for analyzing and properly categorizing it. Previous research has mainly focused on categorizing audio files into appropriate genres; however, genre types vary with language. This paper proposes a systematic audio content analysis strategy that first detects whether an audio file contains vocals and, if it does, then detects the language of the song. Given the language of the song, genre detection becomes a closed-set classification problem.
Keywords
audio signals; signal detection; Gaussian mixture models; audio content analysis; genre detection; language detection; shifted delta cepstra; telephonic speech; Cepstral analysis; Educational institutions; Instruments; Internet; Mel frequency cepstral coefficient; Natural languages; Performance analysis; Search engines; Speech analysis; Support vector machines; Audio Content Analysis; GMM-supervector; Gaussian Mixture Model; Language Identification
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518058
Filename
4518058