DocumentCode :
2759512
Title :
Learning the Intrinsic Dimensions of the TIMIT Speech Database with Maximum Variance Unfolding
Author :
Vasiloglou, Nikolaos ; Anderson, David V. ; Gray, Alexander G.
Author_Institution :
Georgia Inst. of Technol., Atlanta, GA
fYear :
2009
fDate :
4-7 Jan. 2009
Firstpage :
48
Lastpage :
53
Abstract :
Modern methods for nonlinear dimensionality reduction have been used extensively in the machine learning community for discovering the intrinsic dimension of several datasets. In this paper we apply one of the most successful of these methods, maximum variance unfolding (MVU), to a large sample of the well-known speech benchmark TIMIT. Although MVU is not generally scalable, we managed to apply it to 1 million 39-dimensional points and successfully reduced the dimension to 15. We apply some of the state-of-the-art techniques for handling big datasets. The biggest bottleneck is the local neighborhood computation: for 300 K points it took 9 hours, while for 1 M points it took 3.5 days. We also demonstrate the weakness of the MFCC representation under k-nearest neighborhood classification, since the error rate is more than 50%.
Keywords :
audio databases; learning (artificial intelligence); signal classification; speech recognition; MFCC representation; TIMIT speech database; intrinsic dimensions; k-nearest neighborhood classification; machine learning; maximum variance unfolding; nonlinear dimensionality reduction; speech recognition; Cepstrum; Databases; Error analysis; Machine learning; Manifolds; Mel frequency cepstral coefficient; Search engines; Smoothing methods; Speech recognition; Testing; Dimensionality Reduction; Manifold Learning; Maximum Variance Unfolding; Mel Cepstrum; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop, 2009. DSP/SPE 2009. IEEE 13th
Conference_Location :
Marco Island, FL
Print_ISBN :
978-1-4244-3677-4
Electronic_ISBN :
978-1-4244-3677-4
Type :
conf
DOI :
10.1109/DSP.2009.4785894
Filename :
4785894