Regional Information Center for Science and Technology - Unsupervised learning of auditory filter banks using non-negative matrix factorisation

DocumentCode :

3425542

Title :

Unsupervised learning of auditory filter banks using non-negative matrix factorisation

Author :

Bertrand, Alexander ; Demuynck, Kris ; Stouten, Veronique ; Van hamme, Hugo

Author_Institution :

Dept. ESAT Kasteelpark Arenberg 10, Katholieke Univ. Leuven, Leuven

fYear :

2008

fDate :

March 31 2008-April 4 2008

Firstpage :

4713

Lastpage :

4716

Abstract :

Non-negative matrix factorisation (NMF) is an unsupervised learning technique that decomposes a non-negative data matrix into a product of two lower-rank non-negative matrices. The non-negativity constraint results in a parts-based and often sparse representation of the data. We use NMF to factorise a matrix with spectral slices of continuous speech to automatically find a feature set for speech recognition. The resulting decomposition yields a filter bank design with remarkable similarities to perceptually motivated designs, supporting the hypothesis that human hearing and speech production are well matched to each other. We point out that the divergence cost criterion used by NMF is linearly dependent on energy, which may influence the design. We will, however, argue that this does not significantly affect the interpretation of our results. Furthermore, we compare our filter bank with several hearing models found in the literature. Evaluating the filter bank for speech recognition shows that the same recognition performance is achieved as with classical MEL-based features.
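The factorisation described in the abstract can be illustrated with a small sketch. The record does not say which update rule the authors used, so the code below assumes the widely used multiplicative updates for the generalised Kullback-Leibler divergence; the function names (`matmul`, `kl_divergence`, `nmf_kl`), the toy 6 x 5 "spectrogram" and all parameter values are hypothetical and chosen only for illustration. It is written in plain Python, without a numerical library, to stay self-contained.

```python
import math
import random

EPS = 1e-12  # guards against division by zero and log of zero


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def kl_divergence(v, approx):
    """Generalised KL divergence D(V || approx), summed over all entries."""
    total = 0.0
    for v_row, a_row in zip(v, approx):
        for x, y in zip(v_row, a_row):
            if x > 0:
                total += x * math.log(x / (y + EPS))
            total += y - x
    return total


def nmf_kl(v, rank, iterations=200, seed=0):
    """Factorise non-negative V (F x N) as W (F x rank) times H (rank x N).

    Assumption: multiplicative updates for the KL divergence; this is a
    sketch, not necessarily the procedure used in the paper.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random initialisation.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(iterations):
        # Update H with W fixed: H <- H * (W^T (V / WH)) / (column sums of W).
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + EPS) for j in range(n_cols)]
                 for i in range(n_rows)]
        for r in range(rank):
            col_sum = sum(w[i][r] for i in range(n_rows)) + EPS
            for j in range(n_cols):
                num = sum(w[i][r] * ratio[i][j] for i in range(n_rows))
                h[r][j] *= num / col_sum
        # Update W with H fixed: W <- W * ((V / WH) H^T) / (row sums of H).
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + EPS) for j in range(n_cols)]
                 for i in range(n_rows)]
        for r in range(rank):
            row_sum = sum(h[r]) + EPS
            for i in range(n_rows):
                num = sum(h[r][j] * ratio[i][j] for j in range(n_cols))
                w[i][r] *= num / row_sum
    return w, h


# Toy data (hypothetical): 6 frequency bins x 5 frames, mixed from two shapes.
low = [4.0, 3.0, 1.0, 0.0, 0.0, 0.0]
high = [0.0, 0.0, 0.0, 1.0, 3.0, 4.0]
gains = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0), (1.0, 1.0), (0.2, 0.8)]
V = [[low[f] * a + high[f] * b for a, b in gains] for f in range(6)]

W0, H0 = nmf_kl(V, rank=2, iterations=0)   # random starting point
W, H = nmf_kl(V, rank=2, iterations=200)   # same start, after the updates
print("divergence before:", kl_divergence(V, matmul(W0, H0)))
print("divergence after: ", kl_divergence(V, matmul(W, H)))
```

In a setting like the one in the abstract, the columns of `W` would play the role of learned spectral shapes (candidate filters) and the rows of `H` their activations over time. One reading of the remark about energy dependence is that this divergence scales linearly with the data: multiplying both `V` and its approximation by a constant `a` multiplies the cost by `a`, so louder frames weigh more heavily in the fit.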

Keywords :

channel bank filters; matrix decomposition; speech processing; speech recognition; unsupervised learning; auditory filter banks; continuous speech; hearing models; nonnegative data matrix; nonnegative matrix factorisation; nonnegativity constraint; Auditory system; Filter bank; Humans; Sparse matrices; Speech analysis; Vectors; Feature extraction; Non-negative matrix decomposition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location :

Las Vegas, NV

ISSN :

1520-6149

Print_ISBN :

978-1-4244-1483-3

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2008.4518709

Filename :

4518709

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3425542