Title :
Gaussian Mixture Modeling Using Short Time Fourier Transform Features for Audio Fingerprinting
Author :
Ramalingam, Arunan ; Krishnan, Sridhar
Author_Institution :
Dept. of Electr. & Comput. Eng., Ryerson Univ., Toronto, Ont.
Abstract :
In audio fingerprinting, an audio clip must be recognized by matching an extracted fingerprint to a database of previously computed fingerprints. The fingerprints should reduce the dimensionality of the input significantly, discriminate among different audio clips, and at the same time be invariant to distorted versions of the same audio clip. In this paper, we design fingerprints that address these requirements by modeling an audio clip with Gaussian mixture models (GMMs) using a wide range of easy-to-compute short-time Fourier transform features, such as Shannon entropy, Rényi entropy, spectral centroid, spectral bandwidth, spectral flatness measure, spectral crest factor, and Mel-frequency cepstral coefficients. We test the robustness of the fingerprints under a large number of distortions. To make the system robust, we use some of the distorted versions of the audio for training. However, we show that the audio fingerprints modeled using GMMs are robust not only to the distortions used in training but also to distortions not used in training. Using spectral centroid as the feature, we obtain the highest identification rate of 99.1% with a false positive rate of 10^-4.
Keywords :
Fourier transforms; Gaussian processes; audio signal processing; feature extraction; fingerprint identification; speech recognition; Fourier transform; GMM; Gaussian mixture model; audio clip recognition; audio fingerprinting; spectral centroid; training phase; Audio databases; Bandwidth; Cepstral analysis; Distortion measurement; Entropy; Fingerprint recognition; Robustness; Spatial databases; Time measurement
Conference_Title :
Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on
Conference_Location :
Amsterdam
Print_ISBN :
0-7803-9331-7
DOI :
10.1109/ICME.2005.1521629