DocumentCode :
3434427
Title :
Two-stage speech/music classifier with decision smoothing and sharpening in the EVS codec
Author :
Malenovsky, Vladimir ; Vaillancourt, Tommy ; Wang, Zhe ; Choo, Kihyun ; Atti, Venkatraman
Author_Institution :
VoiceAge Corp., Montreal, QC, Canada
fYear :
2015
fDate :
19-24 April 2015
Firstpage :
5718
Lastpage :
5722
Abstract :
In most internationally recognized standardized multi-mode codecs, signal classification is performed in a single step by either linear discrimination or SNR-based metrics. The speech/music classifier of the EVS codec achieves greater discrimination than these single-step models by combining Gaussian mixture modelling (GMM) with a series of context-based improvement layers. Additionally, unlike traditional GMM classifiers, the EVS model adopts a short hangover period, allowing it to track transitions between music and speech. Misclassifications are mitigated by applying a novel decision smoothing and sharpening technique. The results in relatively static environments demonstrate that the new two-stage approach with selective hangover leads to classification accuracies comparable to speech/music classifiers with longer hangovers. They also show that the new approach leads to faster and more accurate switching of coding modes than conventional classifiers for more complex audio environments such as advertisements, jingles and speech superimposed on music.
Keywords :
Gaussian processes; mixture models; signal classification; speech coding; EVS codec; Gaussian mixture modelling; decision sharpening; decision smoothing; linear discrimination; signal classification; two-stage speech/music classifier; Codecs; Databases; Multiple signal classification; Smoothing methods; Speech; Speech coding; EVS; GMM; sharpening; smoothing; speech/music classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
Type :
conf
DOI :
10.1109/ICASSP.2015.7179067
Filename :
7179067