• DocumentCode
    2240807
  • Title

    Gaussian Mixture Modeling Using Short Time Fourier Transform Features for Audio Fingerprinting

  • Author

    Ramalingam, Arunan ; Krishnan, Sridhar

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Ryerson Univ., Toronto, Ont.
  • fYear
    2005
  • fDate
    6-6 July 2005
  • Firstpage
    1146
  • Lastpage
    1149
  • Abstract
    In audio fingerprinting, an audio clip must be recognized by matching an extracted fingerprint to a database of previously computed fingerprints. The fingerprints should reduce the dimensionality of the input significantly, discriminate among different audio clips, and at the same time be invariant to distorted versions of the same audio clip. In this paper, we design fingerprints that address these requirements by modeling an audio clip with Gaussian mixture models (GMM) using a wide range of easy-to-compute short time Fourier transform features such as Shannon entropy, Renyi entropy, spectral centroid, spectral bandwidth, spectral flatness measure, spectral crest factor, and Mel-frequency cepstral coefficients. We test the robustness of the fingerprints under a large number of distortions. To make the system robust, we use some of the distorted versions of the audio for training. However, we show that the audio fingerprints modeled using GMM are robust not only to the distortions used in training but also to distortions not used in training. Using spectral centroid as the feature, we obtain the highest identification rate of 99.1% with a false positive rate of 10^-4.
  • Keywords
    Fourier transforms; Gaussian processes; audio signal processing; feature extraction; fingerprint identification; speech recognition; Fourier transform; GMM; Gaussian mixture model; audio clip recognition; audio fingerprinting; feature extraction; spectral centroid; training phase; Audio databases; Bandwidth; Cepstral analysis; Distortion measurement; Entropy; Fingerprint recognition; Fourier transforms; Robustness; Spatial databases; Time measurement;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on
  • Conference_Location
    Amsterdam
  • Print_ISBN
    0-7803-9331-7
  • Type

    conf

  • DOI
    10.1109/ICME.2005.1521629
  • Filename
    1521629
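
  • Illustrative sketch (not part of the original record)

    The abstract describes computing short time Fourier transform features per frame and modeling each clip with a Gaussian mixture model. The following is a minimal sketch of that idea for one feature, the spectral centroid, assuming NumPy is available. The function names, frame length, hop size, number of mixture components, quantile-based initialization, and the use of average log-likelihood as a matching score are all assumptions made for illustration; the abstract does not specify the authors' settings or matching rule.

```python
import numpy as np

def spectral_centroid(signal, sample_rate, frame_len=1024, hop=512):
    """Per-frame spectral centroid in Hz from a Hann-windowed magnitude STFT.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    centroids = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        mag = np.abs(np.fft.rfft(signal[start:start + frame_len] * window))
        total = mag.sum()
        # A silent frame has no defined centroid; report 0 Hz for it.
        centroids.append((freqs * mag).sum() / total if total > 0 else 0.0)
    return np.array(centroids)

def _component_log_probs(x, weights, means, variances):
    """Log of weight_k * N(x_i | mean_k, var_k), shape (n_samples, n_components)."""
    diff = x[:, None] - means[None, :]
    return (-0.5 * (diff ** 2 / variances + np.log(2 * np.pi * variances))
            + np.log(weights))

def fit_gmm_1d(x, n_components=2, n_iter=50):
    """Fit a 1-D Gaussian mixture with plain EM (hypothetical helper).

    Means start at evenly spaced quantiles of the data, an assumed
    initialization chosen here only to keep the sketch deterministic.
    """
    x = np.asarray(x, dtype=float)
    means = np.quantile(x, (np.arange(n_components) + 0.5) / n_components)
    variances = np.full(n_components, x.var() + 1e-6)
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each sample.
        log_p = _component_log_probs(x, weights, means, variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances

def avg_log_likelihood(x, weights, means, variances):
    """Mean per-sample log-likelihood of x under the mixture."""
    x = np.asarray(x, dtype=float)
    log_p = _component_log_probs(x, weights, means, variances)
    m = log_p.max(axis=1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))).mean())
```

    In this sketch, a clip's fingerprint would be the tuple returned by fit_gmm_1d on its centroid sequence, and a query could be scored against each stored fingerprint with avg_log_likelihood, picking the highest score. A real system would likely use multi-dimensional features and a decision threshold to control false positives.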