Feature extraction with convolutional restricted Boltzmann machine for audio classification

Author

Min Li;Zhenjiang Miao;Cong Ma

Author_Institution

Institute of Information Science, Beijing Jiaotong University, China

fYear

2015

Firstpage

791

Lastpage

795

Abstract

Feature extraction is a crucial part of a large number of audio tasks. Researchers have extracted audio features in multiple ways, and some of the most recent methods are based on the hidden layer of a trained neural network. In this paper, we present a system that automatically extracts features from unlabeled audio data; the extracted features are then used for an audio classification task. Our feature extraction scheme uses a convolutional restricted Boltzmann machine (CRBM) instead of a restricted Boltzmann machine (RBM). With CRBM-based features, we consistently achieve about 7% higher accuracy than with RBM-based features on the TI-Digits dataset for audio classification. We also combine the well-known MFCC features and the CRBM-based features in the form of a linear combination. In our experiments, the combined feature performs better than either feature alone.

Keywords

"Feature extraction","Training","Mel frequency cepstral coefficient","Mathematical model","Support vector machines","Data mining","Training data"

Publisher

IEEE

Conference_Titel

Pattern Recognition (ACPR), 2015 3rd IAPR Asian Conference on

Electronic_ISBN

2327-0985

Type

conf

DOI

10.1109/ACPR.2015.7486611

Filename

7486611