• DocumentCode
    178390
  • Title
    Multi-view learning with supervision for transformed bottleneck features
  • Author
    Arora, Rajkumar; Livescu, Karen
  • Author_Institution
    TTI-Chicago, Chicago, IL, USA
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    2499
  • Lastpage
    2503
  • Abstract
    Previous work has shown that acoustic features can be improved by unsupervised learning of transformations based on canonical correlation analysis (CCA) using articulatory measurements that are available at training time. In this paper, we investigate whether this second view (articulatory data) still helps even when labels are also available at training time. We begin with strong baseline bottleneck features, which can be learned when the training set is phonetically labeled. We then compare several options for learning transformations of the bottleneck features in the presence of both articulatory measurements and phonetic labels for the training data. The methods compared include combinations of LDA and CCA, as well as a three-view extension of CCA that simultaneously uses the labels and articulatory measurements as additional views. Phonetic recognition experiments on data from the University of Wisconsin X-ray microbeam database show that the learned features improve performance over using either just the labels or just the articulatory measurements for learning acoustic transformations.
  • Keywords
    acoustic signal processing; correlation theory; speech recognition; unsupervised learning; CCA; LDA; Wisconsin University; X-ray microbeam database; acoustic features; acoustic transformations; articulatory data; articulatory measurements; baseline bottleneck features; canonical correlation analysis; linear discriminant analysis; multiview learning; phonetic labels; phonetic recognition; supervision; three-view extension; training time; transformed bottleneck features; unsupervised learning; Acoustic measurements; Correlation; Mel frequency cepstral coefficient; Speech; Speech recognition; Training; articulatory measurements; bottleneck features; canonical correlation analysis; multi-view learning; supervised transformation learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854050
  • Filename
    6854050