• DocumentCode
    1133495
  • Title

    Graceful degradation of speech recognition performance over packet-erasure networks

  • Author

    Boulis, Constantinos ; Ostendorf, Mari ; Riskin, Eve A. ; Otterson, Scott

  • Author_Institution
    Dept. of Electr. Eng., Univ. of Washington, Seattle, WA, USA
  • Volume
    10
  • Issue
    8
  • fYear
    2002
  • fDate
    11/1/2002 12:00:00 AM
  • Firstpage
    580
  • Lastpage
    590
  • Abstract
    This paper explores packet loss recovery for automatic speech recognition (ASR) in spoken dialog systems, assuming an architecture in which a lightweight client communicates with a remote ASR server. Speech is transmitted with source and channel codes optimized for the ASR application, i.e., to minimize word error rate. Unequal amounts of forward error correction, depending on the data's effect on ASR performance, are assigned to protect against packet loss. Experiments with simulated packet loss in a range of loss conditions are conducted on the DARPA Communicator (air travel information) task. Results show that the approach provides robust ASR performance which degrades gracefully as packet loss rates increase. Transmitting at 5.2 Kbps with up to 200 ms of added delay leads to only a 7% relative degradation in word error rate even under extremely adverse network conditions.
  • Keywords
    forward error correction; mobile radio; packet radio networks; source coding; speech recognition; voice communication; 200 ms; 5.2 Kbit/s; ASR server; DARPA Communicator task; air travel information task; automatic speech recognition; delay; forward error correction; graceful degradation; packet loss rates; packet loss recovery; packet-erasure networks; portable devices; source coding; speech recognition performance; speech transmission; spoken dialog systems; vector quantization; word error rate; Automatic speech recognition; Degradation; Error analysis; Forward error correction; Network servers; Performance loss; Protection; Robustness; Speech coding; Speech recognition;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/TSA.2002.804532
  • Filename
    1175530