Regional Information Center for Science and Technology (RICEST) - Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification

DocumentCode :

1336159

Title :

Non-Negative Multilinear Principal Component Analysis of Auditory Temporal Modulations for Music Genre Classification

Author :

Panagakis, Yannis ; Kotropoulos, Constantine ; Arce, Gonzalo R.

Author_Institution :

Dept. of Inf., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece

Volume :

Issue :

fYear :

2010

fDate :

3/1/2010

Firstpage :

576

Lastpage :

588

Abstract :

Motivated by psychophysiological investigations on the human auditory system, a bio-inspired two-dimensional auditory representation of music signals is exploited that captures slow temporal modulations. Although each recording is represented by a second-order tensor (i.e., a matrix), a third-order tensor is needed to represent a music corpus. Non-negative multilinear principal component analysis (NMPCA) is proposed for the unsupervised dimensionality reduction of the third-order tensors. NMPCA maximizes the total tensor scatter while preserving the non-negativity of the auditory representations. An algorithm for NMPCA is derived by exploiting the structure of the Grassmann manifold. NMPCA is compared against three multilinear subspace analysis techniques, namely non-negative tensor factorization, high-order singular value decomposition, and multilinear principal component analysis, as well as their linear counterparts, i.e., non-negative matrix factorization, singular value decomposition, and principal component analysis, in extracting features that are subsequently classified by either support vector machine or nearest neighbor classifiers. Three different sets of experiments conducted on the GTZAN and the ISMIR2004 Genre datasets demonstrate the superiority of NMPCA over the aforementioned subspace analysis techniques in extracting more discriminating features, especially when the training set has small cardinality. The best classification accuracies reported in the paper exceed those obtained by the state-of-the-art music genre classification algorithms applied to both datasets.

Keywords :

audio signal processing; modulation; music; pattern classification; principal component analysis; singular value decomposition; support vector machines; tensors; Grassmann manifold; auditory temporal modulations; bio-inspired 2D auditory representation; human auditory system; music genre classification; nearest neighbor classifiers; non-negative multilinear principal component analysis (NMPCA); non-negative tensor factorization (NTF); psychophysiological investigations; auditory representations; non-negative matrix factorization (NMF)

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2009.2036813

Filename :

5337979

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1336159