Title :
Audio indexing using feature warping and fusion techniques
Author :
Sénac, Christine ; Ambikairajah, Eliathamby
Author_Institution :
Institut de Recherche en Informatique de Toulouse, CNRS INP UPS, Toulouse, France
Date :
29 Sept.-1 Oct. 2004
Abstract :
This paper reports improved speech and music indexing performance for radio broadcast under various noise conditions, obtained by fusing warped features with traditional features at the output stage. The system employs a bank of four parallel front ends, each using a different feature extraction technique, followed by classification into speech and music with Gaussian mixture models. The resulting segments are then automatically grouped into macro classes. Indexing was performed on 8 hours of manually labelled multilingual Radio France International broadcast recordings containing diverse speech and music content, with different speaking styles, speakers, noise conditions and channels. For speech classification under the noisiest conditions, the warped features fused with traditional features produced an error rate one third of that obtained with either the warped features or the traditional features alone. Significant improvements were also found for speech classification under less noisy conditions.
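The two ideas named in the abstract, feature warping and fusion at the output stage, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the warping window length, the fusion rule, or the weights, so the function names `warp_features` and `fuse_scores`, the 301-frame window, the rank-based mapping to a standard normal, and the weighted sum of log-likelihood ratios are hypothetical choices and not the authors' implementation.

```python
from statistics import NormalDist


def warp_features(frames, window=301):
    """Rank-based feature warping (illustrative, assumed variant).

    Each feature value is replaced by the standard-normal quantile of its
    rank inside a sliding window centred on the current frame.
    `frames` is a list of equal-length feature vectors (lists of floats).
    """
    inv_cdf = NormalDist().inv_cdf
    half = window // 2
    n_frames = len(frames)
    warped = []
    for t, frame in enumerate(frames):
        # Window is truncated at the start and end of the recording.
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        block = frames[lo:hi]
        n = len(block)
        row = []
        for d, value in enumerate(frame):
            # Rank 1 = smallest value in the window for dimension d.
            rank = 1 + sum(1 for other in block if other[d] < value)
            # (rank - 0.5) / n lies strictly between 0 and 1.
            row.append(inv_cdf((rank - 0.5) / n))
        warped.append(row)
    return warped


def fuse_scores(llr_per_front_end, weights=None):
    """Score-level fusion (assumed rule: weighted sum of scores).

    `llr_per_front_end[k][t]` is the speech-versus-music log-likelihood
    ratio that front end k assigns to frame t, e.g. from two GMMs.
    Returns one label per frame.
    """
    k = len(llr_per_front_end)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights, an assumption
    n_frames = len(llr_per_front_end[0])
    fused = [
        sum(w * scores[t] for w, scores in zip(weights, llr_per_front_end))
        for t in range(n_frames)
    ]
    return ["speech" if score > 0 else "music" for score in fused]
```

In such a setup, each front end would score frames with its own pair of Gaussian mixture models, and only the resulting per-frame scores would be combined, which is one common reading of fusion "at the output stage".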
Keywords :
Gaussian processes; audio signal processing; feature extraction; indexing; music; noise; radio broadcasting; speech processing; Gaussian mixture model; audio indexing; feature extraction technique; feature warping; fusion technique; macro class; multilingual Radio France International recording; parallel front end; radio broadcast; speech signal classification; Feature extraction; Indexing; Indium phosphide; Music; Pattern classification; Radio broadcasting; Speech enhancement; Speech processing; Telephony; Uninterruptible power systems;
Conference_Title :
Multimedia Signal Processing, 2004 IEEE 6th Workshop on
Print_ISBN :
0-7803-8578-0
DOI :
10.1109/MMSP.2004.1436567