• DocumentCode
    3530701
  • Title

    Maximizing global entropy reduction for active learning in speech recognition

  • Author

    Varadarajan, Balakrishnan ; Yu, Dong ; Deng, Li ; Acero, Alex

  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4721
  • Lastpage
    4724
  • Abstract
    We propose a new active learning algorithm for selecting a limited subset of utterances to transcribe from a large pool of unlabeled utterances so that the accuracy of the automatic speech recognition system is maximized. Our algorithm differs from earlier work in that it uses a criterion that maximizes the lattice entropy reduction over the whole dataset. We introduce the criterion, show how it can be simplified and approximated, and describe the detailed algorithm for optimizing it. We demonstrate the effectiveness of the algorithm on directory assistance data collected under real usage scenarios and show that it consistently outperforms the confidence-based approach by a significant margin. To reach the same recognition accuracy, the algorithm cuts the number of utterances that must be transcribed by 50% relative to the confidence-based approach and by 60% relative to random sampling.
  • Keywords
    learning (artificial intelligence); maximum entropy methods; speech recognition; active learning algorithm; automatic speech recognition system; confidence-based approach; global entropy reduction maximization; Acoustic testing; Automatic speech recognition; Databases; Decoding; Electrostatic precipitators; Entropy; Error analysis; Lattices; Sampling methods; Speech recognition; Active learning; acoustic model; confidence; entropy; lattice
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2009.4960685
  • Filename
    4960685