Regional Information Center for Science and Technology - Robust singing detection in speech/music discriminator design

DocumentCode :

3344116

Title :

Robust singing detection in speech/music discriminator design

Author :

Chou, Wu ; Gu, Liang

Author_Institution :

Lucent Technol. Bell Labs., Murray Hill, NJ, USA

Volume :

fYear :

2001

fDate :

2001

Firstpage :

865

Abstract :

In this paper, an approach for robust singing signal detection in speech/music discrimination is proposed and applied to audio indexing. Conventional approaches to speech/music discrimination can provide reasonable performance with regular music signals but often perform poorly on singing segments. This is due mainly to the fact that speech and singing signals are extremely close and that traditional features used in speech recognition do not provide a reliable cue for discriminating speech from singing. In order to improve the robustness of speech/music discrimination, a new set of features derived from the harmonic coefficient and its 4 Hz modulation values is developed in this paper, and these new features provide additional and reliable cues to separate speech from singing. In addition, a rule-based post-filtering scheme is described, which leads to further improvements in speech/music discrimination. Source-independent audio indexing experiments on the PBS Skills database indicate that the proposed approach can greatly reduce the classification error rate on singing segments in the audio stream. Compared with existing approaches, the overall segmentation error rate is reduced by more than 30%, averaged over all shows in the database.

Keywords :

audio signal processing; database indexing; music; speech processing; speech recognition; 4 Hz; audio indexing; audio stream; classification error rate; harmonic coefficient; modulation; music signals; robust singing detection; robustness; rule-based post-filtering scheme; segmentation error rate; signal discrimination; singing segments; speech/music discriminator design; Automatic speech recognition; Cepstral analysis; Error analysis; Indexing; Multiple signal classification; Robustness; Signal detection; Signal processing; Speech recognition; Streaming media;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on

Conference_Location :

Salt Lake City, UT

ISSN :

1520-6149

Print_ISBN :

0-7803-7041-4

Type :

conf

DOI :

10.1109/ICASSP.2001.941052

Filename :

941052

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3344116