• DocumentCode
    730718
  • Title
    Multi-frame factorisation for long-span acoustic modelling
  • Author
    Lu, Liang ; Renals, Steve
  • Author_Institution
    Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4595
  • Lastpage
    4599
  • Abstract
    Acoustic models based on Gaussian mixture models (GMMs) typically use short-span acoustic feature inputs. Owing to the conditional independence assumption of hidden Markov models, such inputs do not capture long-term temporal information from speech. In this paper, we present an implicit approach that approximates the joint distribution of long-span features by a product of factorized models, in contrast to deep neural networks (DNNs), which model feature correlations directly. The approach is applicable to a broad range of acoustic models. We present experiments using GMM and probabilistic linear discriminant analysis (PLDA) based models on Switchboard, observing consistent word error rate reductions.
  • Keywords
    Gaussian processes; hidden Markov models; mixture models; neural nets; probability; speech processing; GMM; Gaussian mixture models; deep neural networks; long-span acoustic modelling; long-term temporal information; multiframe factorisation; probabilistic linear discriminant analysis; word error rate reductions; Joints; Mel frequency cepstral coefficient; Speech; Speech recognition; Switches; Acoustic modelling; long span features; multi-frame factorisation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178841
  • Filename
    7178841