Regional Information Center for Science and Technology (RICEST) - Multi-view learning with supervision for transformed bottleneck features

DocumentCode :

178390

Title :

Multi-view learning with supervision for transformed bottleneck features

Author :

Arora, Raman; Livescu, Karen

Author_Institution :

TTI-Chicago, Chicago, IL, USA

fYear :

2014

fDate :

4-9 May 2014

Firstpage :

2499

Lastpage :

2503

Abstract :

Previous work has shown that acoustic features can be improved by unsupervised learning of transformations based on canonical correlation analysis (CCA) using articulatory measurements that are available at training time. In this paper, we investigate whether this second view (articulatory data) still helps even when labels are also available at training time. We begin with strong baseline bottleneck features, which can be learned when the training set is phonetically labeled. We then compare several options for learning transformations of the bottleneck features in the presence of both articulatory measurements and phonetic labels for the training data. The methods compared include combinations of LDA and CCA, as well as a three-view extension of CCA that simultaneously uses the labels and articulatory measurements as additional views. Phonetic recognition experiments on data from the University of Wisconsin X-ray microbeam database show that the learned features improve performance over using either just the labels or just the articulatory measurements for learning acoustic transformations.
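As a rough illustration of the CCA-based transformation learning the abstract refers to, the following is a minimal sketch of two-view linear CCA in NumPy. It is not the authors' implementation: the function names (`fit_linear_cca`, `transform`), the regularization constant `reg`, and the synthetic data are assumptions made for this example. The second view is used only to fit the projection, mirroring the setting where articulatory measurements are available at training time only.

```python
import numpy as np


def fit_linear_cca(X, Y, k, reg=1e-4):
    """Fit a k-dimensional CCA projection for view X using paired view Y.

    X: (n, dx) array, e.g. acoustic features (hypothetical stand-in).
    Y: (n, dy) array, e.g. articulatory measurements (hypothetical stand-in).
    reg: small ridge term added to the covariances (an assumption, for stability).
    Returns the mean of X, the (dx, k) projection matrix, and the top-k
    canonical correlations.
    """
    n = X.shape[0]
    mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mean_x, Y - mean_y

    # Regularized within-view covariances and the cross-view covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the canonical
    # correlations; the left singular vectors give the directions for X.
    U, s, _ = np.linalg.svd(Sxx_is @ Sxy @ Syy_is, full_matrices=False)
    A = Sxx_is @ U[:, :k]
    return mean_x, A, s[:k]


def transform(X, mean_x, A):
    """Project new first-view data; the second view is not needed here."""
    return (X - mean_x) @ A


# Synthetic demo: two noisy views that share a 2-dimensional latent signal.
rng = np.random.default_rng(0)
z = rng.standard_normal((500, 2))
X_train = z @ rng.standard_normal((2, 10)) + 0.1 * rng.standard_normal((500, 10))
Y_train = z @ rng.standard_normal((2, 6)) + 0.1 * rng.standard_normal((500, 6))

mean_x, A, corrs = fit_linear_cca(X_train, Y_train, k=2)
features = transform(X_train, mean_x, A)
print(features.shape, corrs)
```

At test time only `transform` is called, so the learned projection can be applied to acoustic-style input without any second view. The three-view extension and the LDA combinations mentioned in the abstract are not covered by this sketch.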

Keywords :

acoustic signal processing; correlation theory; speech recognition; unsupervised learning; CCA; LDA; Wisconsin University; X-ray microbeam database; acoustic features; acoustic transformations; articulatory data; articulatory measurements; baseline bottleneck features; canonical correlation analysis; linear discriminant analysis; multiview learning; phonetic labels; phonetic recognition; supervision; three-view extension; training time; transformed bottleneck features; Acoustic measurements; Correlation; Mel frequency cepstral coefficient; Speech; Speech recognition; Training; bottleneck features; multi-view learning; supervised transformation learning

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6854050

Filename :

6854050

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=178390