DocumentCode
592089
Title
Using Wavelets and Gaussian Mixture Models for Audio Classification
Author
Ching-Hua Chuan ; Vasana, S. ; Asaithambi, Asai
Author_Institution
Sch. of Comput., Univ. of North Florida, Jacksonville, FL, USA
fYear
2012
fDate
10-12 Dec. 2012
Firstpage
421
Lastpage
426
Abstract
In this paper, we present an audio classification system that uses wavelets to extract low-level acoustic features. We perform multiple-level decomposition with the Discrete Wavelet Transform to extract acoustic features at different scales and times from audio recordings. The extracted features are then converted into a compact vector representation. Gaussian Mixture Models trained with the Expectation-Maximization algorithm are then used to build models for the sound classes. Three audio classification tasks are designed to evaluate the system: speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. Evaluated with 5-fold cross-validation, the experimental results show the promising capability of wavelets for speech and music analysis.
Keywords
Gaussian processes; audio recording; audio signal processing; discrete wavelet transforms; expectation-maximisation algorithm; feature extraction; music; Gaussian mixture model; audio classification; audio recordings; compact vector representation; discrete wavelet transform; expectation maximization algorithm; low-level acoustic features; male/female speech classification; multiple-level decomposition; music genre classification; sound classes; speech/music classification; Feature extraction; Mathematical model; Speech; Vectors; Wavelet analysis; Wavelet transforms; Gaussian Mixture Models; Wavelets
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia (ISM), 2012 IEEE International Symposium on
Conference_Location
Irvine, CA
Print_ISBN
978-1-4673-4370-1
Type
conf
DOI
10.1109/ISM.2012.86
Filename
6424700