• DocumentCode
    178388
  • Title

    A pitch extraction algorithm tuned for automatic speech recognition

  • Author

    Ghahremani, Pegah ; BabaAli, Bagher ; Povey, Daniel ; Riedhammer, Korbinian ; Trmal, Jan ; Khudanpur, Sanjeev

  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    2494
  • Lastpage
    2498
  • Abstract
    In this paper we present an algorithm that produces pitch and probability-of-voicing estimates for use as features in automatic speech recognition systems. These features give large performance improvements on tonal languages for ASR systems, and even substantial improvements for non-tonal languages. Our method, which we are calling the Kaldi pitch tracker (because we are adding it to the Kaldi ASR toolkit), is a highly modified version of the getf0 (RAPT) algorithm. Unlike the original getf0 we do not make a hard decision whether any given frame is voiced or unvoiced; instead, we assign a pitch even to unvoiced frames while constraining the pitch trajectory to be continuous. Our algorithm also produces a quantity that can be used as a probability of voicing measure; it is based on the normalized autocorrelation measure that our pitch extractor uses. We present results on data from various languages in the BABEL project, and show a large improvement over systems without tonal features and systems where pitch and POV information was obtained from SAcC or getf0.
  • Keywords
    feature extraction; probability; speech recognition; ASR systems; BABEL project; Kaldi ASR toolkit; Kaldi pitch tracker; POV information; RAPT algorithm; automatic speech recognition systems; getf0 algorithm; nontonal languages; normalized autocorrelation measure; pitch information; pitch trajectory; probability-of-voicing estimates; unvoiced frames; voicing measure probability; Acoustics; Conferences; Feature extraction; Indexes; Signal processing algorithms; Speech; Speech recognition; Automatic Speech Recognition; Pitch; Probability Of Voicing; Tone
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    Florence
  • Type

    conf

  • DOI
    10.1109/ICASSP.2014.6854049
  • Filename
    6854049