• DocumentCode
    417261
  • Title
    Optimizing acoustic models for commercial speech recognition using foreground scores and data weighting
  • Author
    Boies, Daniel; Strope, Brian; Weintraub, Mitchel; Wu, Su-Lin
  • Author_Institution
    Nuance Commun., Menlo Park, CA, USA
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    This paper describes a data-driven technique for optimizing the acoustic models for speech recognition systems that target commercial applications over telephones. Frame-averaged foreground log-likelihoods (foreground scores) correlate with recognition errors. These scores are used together with gender to optimize data weighting for the acoustic model. This process is interpreted as increasing the priors and associated parameters for poorly modeled data. The score-based optimization leads to about 7% fewer semantic errors on a live evaluation set collected after the last data used to estimate the acoustic model.
  • Keywords
    error statistics; maximum likelihood estimation; optimisation; speech recognition; telephony; acoustic models; commercial speech recognition; data weighting; data-driven technique; foreground scores; frame-averaged foreground log-likelihoods; gender; optimization; recognition errors; semantic errors; telephones; Acoustic applications; Boosting; Degradation; Error analysis; Maximum likelihood estimation; Real time systems; Speech recognition; Statistics; Telephony; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326111
  • Filename
    1326111