Title :
A construction of compact MFCC-type features using short-time statistics for applications in audio segmentation
Author :
von Zeddelmann, Dirk ; Kurth, Frank
Author_Institution :
Res. Inst. for Commun., Inf. Process. & Ergonomics (FKIE), Res. Establ. for Appl. Sci. (FGAN), Wachtberg, Germany
Abstract :
In this paper, we propose a new class of audio features derived from the well-known mel frequency cepstral coefficients (MFCCs), which are widely used in speech processing. More precisely, we calculate suitable short-time statistics during the MFCC computation to obtain smoothed features whose temporal resolution can be adjusted to the application. The approach was motivated by the task of audio segmentation, where the classical MFCCs, with their fine temporal resolution, may exhibit strong fluctuations and consequently yield an unstable segmentation. As our main contribution, the proposed MFCC-ENS (MFCC-Energy Normalized Statistics) features can be adapted to a lower, more suitable temporal resolution while summarizing the essential information contained in the MFCCs. Our experiments on the segmentation of radio programmes demonstrate the benefits of the newly proposed features.
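The general idea of trading temporal resolution for stability can be illustrated with a minimal sketch. It is not the authors' MFCC-ENS procedure, because the abstract does not specify which statistics are computed or at which stage of the MFCC pipeline. The sketch assumes frame-level feature vectors are already available, uses a Hann-weighted mean as the short-time statistic, and uses unit L2 normalization as a stand-in for the energy normalization. The names `hann_weights`, `smooth_and_downsample`, `window_length`, and `hop` are hypothetical.

```python
import math


def hann_weights(length):
    """Hann-shaped weights without zero endpoints, scaled to sum to 1."""
    raw = [0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 1) / (length + 1))
           for n in range(length)]
    total = sum(raw)
    return [value / total for value in raw]


def smooth_and_downsample(frames, window_length, hop):
    """Summarize frame-level feature vectors at a coarser temporal resolution.

    frames: list of equal-length feature vectors, one per short-time frame.
    window_length: number of consecutive frames averaged per output vector.
    hop: number of frames between the starts of successive windows.
    Each output vector is a Hann-weighted mean scaled to unit L2 norm
    (an assumed normalization, chosen only for illustration).
    """
    if window_length < 1 or hop < 1:
        raise ValueError("window_length and hop must be positive")
    if not frames:
        return []
    weights = hann_weights(window_length)
    dim = len(frames[0])
    summarized = []
    for start in range(0, len(frames) - window_length + 1, hop):
        vector = [0.0] * dim
        for weight, frame in zip(weights, frames[start:start + window_length]):
            for k in range(dim):
                vector[k] += weight * frame[k]
        norm = math.sqrt(sum(v * v for v in vector))
        if norm > 0.0:
            vector = [v / norm for v in vector]
        summarized.append(vector)
    return summarized
```

A larger `window_length` and `hop` give fewer, smoother vectors per second of audio, while smaller values stay closer to the original frame rate.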
Keywords :
cepstral analysis; smoothing methods; speech processing; statistical analysis; MFCC-ENS; audio feature; audio segmentation; compact MFCC-type features; energy normalized statistics; fine temporal resolution; mel frequency cepstral coefficients; radio programmes; short-time statistics; smoothed features; Feature extraction; Speech; Training; Vectors
Conference_Title :
Signal Processing Conference, 2009 17th European
Conference_Location :
Glasgow
Print_ISBN :
978-161-7388-76-7