• DocumentCode
    1688022
  • Title

    Robustness to speaker position in distant-talking automatic speech recognition

  • Author

    Gomez, Raquel ; Nakamura, Kentaro ; Nakadai, Kazuhiro

  • Author_Institution
    Honda Res. Inst. Japan Co. Ltd., Japan
  • fYear
    2013
  • Firstpage
    7034
  • Lastpage
    7038
  • Abstract
    In this paper, we present a method that significantly improves on our previous work in single-channel dereverberation. The proposed method is more robust to changes in speaker position in distant-talking ASR. First, we update the room transfer function (RTF) and the weighting parameters for dereverberation to the target speaker position. This scheme corrects speech power variation as a function of position at the waveform level. Consequently, its impact on the acoustic model is verified. Then, we implement a fast acoustic model update that reflects the speech power level at the target speaker position. Furthermore, the model update scheme is simple and avoids time-consuming model re-estimation. As a result, the proposed method can be executed online. Together, these corrective measures significantly reduce the mismatch between training and testing conditions. We test our method using real reverberant data from different locations inside the room. Experimental results show that the proposed method outperforms conventional methods in terms of ASR performance. Moreover, our fast acoustic model update scheme is on par with time-consuming model re-estimation in terms of recognition performance.
  • Keywords
    reverberation; speaker recognition; speech enhancement; RTF; acoustic model; distant talking ASR; distant-talking automatic speech recognition; fast acoustic model update scheme; room transfer function; single-channel dereverberation; speaker position; speech enhancement; speech power level; speech power variation; testing conditions; time-consuming model reestimation; training conditions; waveform level; weighting parameters; Acoustics; Adaptation models; Data models; Hidden Markov models; Robustness; Speech; Speech enhancement; Automatic Speech Recognition; Dereverberation; Robustness; Speech Enhancement;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2013.6639026
  • Filename
    6639026