Title :
Novel Features for Effective Speech and Music Discrimination
Author :
Mubarak, Omer Mohsin ; Ambikairajah, Eliathamby ; Epps, Julien
Author_Institution :
Sch. of Electr. Eng. & Telecommun., New South Wales Univ., Sydney, NSW
Abstract :
Speech and music discrimination has attracted growing interest in recent years for efficient coding, automatic retrieval of multimedia sources, and automated speech recognition (ASR). Two novel features that can be concatenated with Mel frequency cepstral coefficients are presented in this paper: delta cepstral energy (DCE) and power spectrum deviation (PSDev). With a Gaussian mixture model as the classification back-end of the system, these features were found to reduce the error rate significantly. The effects of different musical instruments on error rates were also analyzed. Low-frequency musical instruments such as piano and electric bass guitar were found to be more difficult to discriminate from speech; however, the proposed features also reduce such errors significantly.
Keywords :
Gaussian processes; cepstral analysis; multimedia systems; music; musical instruments; signal classification; speech coding; speech recognition; Gaussian mixture model; Mel frequency cepstral coefficient; audio indexing; automated speech recognition; delta cepstral energy; multimedia coding; multimedia retrieval; music discrimination; power spectrum deviation; speech discrimination; Automatic speech recognition; Concatenated codes; Error analysis; Instruments; Music information retrieval; Power system modeling; Gaussian mixture models; Mel frequency cepstral coefficients
Conference_Title :
Engineering of Intelligent Systems, 2006 IEEE International Conference on
Conference_Location :
Islamabad
Print_ISBN :
1-4244-0456-8
DOI :
10.1109/ICEIS.2006.1703190