• DocumentCode
    592089
  • Title

    Using Wavelets and Gaussian Mixture Models for Audio Classification

  • Author

    Ching-Hua Chuan ; Vasana, S. ; Asaithambi, Asai

  • Author_Institution
    Sch. of Comput., Univ. of North Florida, Jacksonville, FL, USA
  • fYear
    2012
  • fDate
    10-12 Dec. 2012
  • Firstpage
    421
  • Lastpage
    426
  • Abstract
    In this paper, we present an audio classification system that uses wavelets to extract low-level acoustic features. We perform multiple-level decomposition with the Discrete Wavelet Transform to extract acoustic features at different scales and times from audio recordings. The extracted features are then translated into a compact vector representation. Gaussian Mixture Models trained with the Expectation Maximization algorithm are then used to build models for sound classes. Three types of audio classification tasks are designed to evaluate the system: speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. In an evaluation with 5-fold cross validation, the experimental results show the promising capability of wavelets for speech and music analysis.
  • Keywords
    Gaussian processes; audio recording; audio signal processing; discrete wavelet transforms; expectation-maximisation algorithm; feature extraction; music; Gaussian mixture model; audio classification; audio recordings; compact vector representation; discrete wavelet transform; expectation maximization algorithm; feature extraction; low-level acoustic features; male/female speech classification; multiple-level decomposition; music genre classification; sound classes; speech/music classification; Feature extraction; Mathematical model; Speech; Vectors; Wavelet analysis; Wavelet transforms; Gaussian Mixture Models; Wavelets; audio classification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia (ISM), 2012 IEEE International Symposium on
  • Conference_Location
    Irvine, CA
  • Print_ISBN
    978-1-4673-4370-1
  • Type

    conf

  • DOI
    10.1109/ISM.2012.86
  • Filename
    6424700
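
The feature-extraction step summarized in the abstract (multiple-level DWT decomposition followed by a compact vector representation) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the wavelet family, the number of levels, or the per-subband statistics, so the Haar wavelet, three levels, and the mean-absolute-value and standard-deviation summary used here are all hypothetical choices, as are the function names. The Gaussian Mixture Model training stage is not shown.

```python
import math


def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample (an assumption).
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavedec(signal, levels):
    """Multi-level decomposition, ordered [approx_L, detail_L, ..., detail_1]."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        details.append(detail)
    return [approx] + details[::-1]


def subband_stats(coeffs):
    """Summarize one subband by its mean absolute value and standard deviation."""
    n = len(coeffs)
    mean_abs = sum(abs(c) for c in coeffs) / n
    mean = sum(coeffs) / n
    std = math.sqrt(sum((c - mean) ** 2 for c in coeffs) / n)
    return [mean_abs, std]


def feature_vector(signal, levels=3):
    """Concatenate the per-subband statistics into one compact vector."""
    vec = []
    for band in wavedec(signal, levels):
        vec.extend(subband_stats(band))
    return vec


# Hypothetical input: one 64-sample frame of a synthetic sine wave.
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
features = feature_vector(frame, levels=3)
print(len(features))  # 8: two statistics for each of the 4 subbands
```

In a full system along the lines of the abstract, vectors like `features` would be computed per audio frame and used to fit one mixture model per sound class; the frame length and the model configuration would need to come from the paper itself.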