Title :
Speech/music discrimination for multimedia applications
Author :
El-Maleh, Khaled ; Klein, Mark ; Petrucci, Grace ; Kabal, Peter
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, Que., Canada
Abstract :
Automatic discrimination of speech and music is an important tool in many multimedia applications. Previous work has focused on long-term features such as differential parameters, variances, and time-averages of spectral parameters. Classifiers built on these features estimate them over windows of 0.5 to 5 seconds and are relatively complex. We present results from combining line spectral frequencies (LSFs) with zero-crossing-based features for frame-level narrowband speech/music discrimination. Classification results for different types of music and speech show that these features discriminate well. Our classification algorithms operate with a frame delay of only 20 ms, making them suitable for real-time multimedia applications.
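As a rough illustration of the frame-level idea in the abstract, the sketch below computes one zero-crossing rate per 20 ms frame. It is not the authors' feature set or classifier: the 8 kHz sampling rate (a common narrowband choice), the non-overlapping frames, and the names `frame_zero_crossing_rates` and `sine` are assumptions made for this example, and the LSF features are not covered.

```python
import math


def frame_zero_crossing_rates(samples, sample_rate=8000, frame_ms=20):
    """Return one zero-crossing rate per non-overlapping frame.

    sample_rate=8000 is an assumed narrowband rate; frame_ms=20 matches
    the frame delay quoted in the abstract.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    rates = []
    # Step through complete frames only; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Count adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        rates.append(crossings / (frame_len - 1))
    return rates


def sine(freq_hz, n, sample_rate=8000):
    """Hypothetical helper: n samples of a unit-amplitude sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    low = frame_zero_crossing_rates(sine(200, 320))
    high = frame_zero_crossing_rates(sine(2000, 320))
    print("200 Hz tone :", low)
    print("2000 Hz tone:", high)
```

Each frame yields a single number, so a decision can be made as soon as 20 ms of signal is available, in contrast to features averaged over 0.5 to 5 second windows.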
Keywords :
classification; multimedia systems; music; real-time systems; speech recognition; classification algorithms; classifiers; differential parameters; frame delay; frame-level narrowband speech music discrimination; line spectral frequencies; real time multimedia applications; real-time multimedia applications; spectral parameter variance; speech-music discrimination; zero crossing-based features; Application software; Automatic speech recognition; Content based retrieval; Delay; Multiple signal classification; Music information retrieval; Narrowband; Signal design; Speech coding; Streaming media;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.859336