Title :
Audio identification based on spectral modeling of bark-bands energy and synchronization through onset detection
Author :
Ramona, Mathieu ; Peeters, Geoffroy
Author_Institution :
Sound Anal./Synthesis Team, Ircam, Paris, France
Abstract :
In this paper, we present for the first time the IRCAM fingerprint system for audio identification in streams. The baseline system relies on a double-nested Short-Time Fourier Transform (STFT). The first STFT computes the energies of a filter bank, which are then modelled over 2 s using a second STFT. We then present recent improvements to our system: first, the inclusion of perceptual scales for amplitude and frequency (Bark bands); then, the synchronization of stream and database frames using an onset detection system. The performance of these improvements is tested on a large set of real audio streams. We compare our results with those of re-implementations of the two state-of-the-art systems from Philips and Shazam.
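The double-nested STFT described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: only the 2 s modelling duration comes from the abstract. The function names, frame length, hop size, number of bands, number of retained coefficients, and the 50% overlap of the modelling windows are all hypothetical choices, and equal-width linear bands stand in for the filter bank (the Bark-band variant is not reproduced here).

```python
import numpy as np

def sliding_windows(a, length, hop):
    """Stack overlapping windows taken along axis 0 of `a`."""
    n = 1 + (a.shape[0] - length) // hop
    return np.stack([a[i * hop : i * hop + length] for i in range(n)])

def band_energies(x, frame_len=512, hop=256, n_bands=8):
    """First STFT: per-frame energies in equal-width bands (assumed layout)."""
    frames = sliding_windows(x, frame_len, hop) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_bins)
    edges = np.linspace(0, power.shape[1], n_bands + 1).astype(int)
    bands = [power[:, a:b].sum(axis=1) for a, b in zip(edges[:-1], edges[1:])]
    return np.stack(bands, axis=1)  # (n_frames, n_bands)

def double_stft_fingerprint(x, sr, frame_len=512, hop=256, n_bands=8,
                            model_dur=2.0, n_coeffs=6):
    """Second STFT: model each band's energy trajectory over `model_dur` seconds."""
    e = band_energies(x, frame_len, hop, n_bands)
    length = int(round(model_dur * sr / hop))       # energy frames per 2 s window
    win = sliding_windows(e, length, length // 2)   # (n_win, length, n_bands)
    win = win * np.hanning(length)[None, :, None]
    spec = np.abs(np.fft.rfft(win, axis=1))         # FFT along the time axis
    # Keep the lowest modulation-frequency coefficients of every band (assumption).
    return spec[:, :n_coeffs, :].reshape(spec.shape[0], -1)
```

Each row of the returned array is one candidate fingerprint vector covering roughly 2 s of signal; how such vectors are quantized, indexed, and matched against a database is outside the scope of this sketch.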
Keywords :
Fourier transforms; audio signal processing; channel bank filters; audio identification; bark-bands energy; baseline system; double-nested short time Fourier transform; filter-bank; fingerprint IRCAM system; onset detection system; spectral modeling; synchronization; Databases; Delay; Digital audio players; Encoding; Noise; Robustness; Synchronization;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5946444