A pitch extraction algorithm tuned for automatic speech recognition

Author

Ghahremani, Pegah ; BabaAli, Bagher ; Povey, Daniel ; Riedhammer, Korbinian ; Trmal, Jan ; Khudanpur, Sanjeev

fYear

2014

fDate

4-9 May 2014

Firstpage

2494

Lastpage

2498

Abstract

In this paper we present an algorithm that produces pitch and probability-of-voicing estimates for use as features in automatic speech recognition systems. These features give ASR systems large performance improvements on tonal languages, and substantial improvements even on non-tonal languages. Our method, which we are calling the Kaldi pitch tracker (because we are adding it to the Kaldi ASR toolkit), is a highly modified version of the getf0 (RAPT) algorithm. Unlike the original getf0, we do not make a hard decision whether any given frame is voiced or unvoiced; instead, we assign a pitch even to unvoiced frames while constraining the pitch trajectory to be continuous. Our algorithm also produces a quantity that can be used as a probability-of-voicing (POV) measure; it is based on the normalized autocorrelation measure that our pitch extractor uses. We present results on data from various languages in the BABEL project, and show a large improvement over systems without tonal features and over systems where pitch and POV information was obtained from SAcC or getf0.
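
The two ideas named in the abstract, a normalized autocorrelation score per candidate lag and a continuous pitch trajectory with no voiced/unvoiced decision, can be illustrated with a small sketch. This is not the Kaldi pitch tracker: the function names, the parameters (`window`, `shift`, `jump_penalty`, `short_lag_bias`), the cost functions, and the use of a plain dynamic-programming search are all assumptions chosen for illustration, and the resampling and feature post-processing of a real tracker are omitted.

```python
import math


def nccf(frame, lag, window):
    """Normalized cross-correlation of frame[0:window] with frame[lag:lag+window]."""
    a = frame[:window]
    b = frame[lag:lag + window]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def track_pitch(samples, sample_rate, min_f0=50.0, max_f0=400.0,
                window=200, shift=80, jump_penalty=0.1, short_lag_bias=0.01):
    """Return one (f0_hz, nccf) pair per frame; no frame is labeled unvoiced."""
    lags = list(range(int(sample_rate / max_f0), int(sample_rate / min_f0) + 1))
    max_lag = lags[-1]
    span = window + max_lag  # samples needed to score every lag for one frame
    starts = range(0, len(samples) - span + 1, shift)

    # NCCF of every candidate lag, for every frame.
    scores = [[nccf(samples[s:s + span], lag, window) for lag in lags]
              for s in starts]
    if not scores:
        return []

    def local_cost(frame_scores):
        # Low cost where the NCCF is high. The bias mildly favors short lags
        # so one pitch period beats two periods (an assumed heuristic).
        return [1.0 - c * (1.0 - short_lag_bias * lag / max_lag)
                for c, lag in zip(frame_scores, lags)]

    # Assumed transition cost: squared log-ratio of consecutive lags.
    n = len(lags)
    log_lags = [math.log(lag) for lag in lags]
    trans = [[jump_penalty * (log_lags[i] - log_lags[j]) ** 2 for j in range(n)]
             for i in range(n)]

    # Forward pass of the dynamic-programming search.
    cost = local_cost(scores[0])
    back = []
    for frame_scores in scores[1:]:
        loc = local_cost(frame_scores)
        new_cost, pointers = [], []
        for j in range(n):
            best_i = min(range(n), key=lambda i: cost[i] + trans[i][j])
            pointers.append(best_i)
            new_cost.append(cost[best_i] + trans[best_i][j] + loc[j])
        cost = new_cost
        back.append(pointers)

    # Backtrace the cheapest lag sequence.
    state = min(range(n), key=lambda i: cost[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()

    return [(sample_rate / lags[i], scores[t][i]) for t, i in enumerate(path)]
```

Calling `track_pitch(samples, 8000)` on a list of float samples returns a pitch for every frame, including silent ones, where the NCCF is 0 and the transition cost alone decides the lag. The second element of each pair is the raw NCCF at the chosen lag; the abstract states that the probability-of-voicing quantity is based on this measure, but the mapping from NCCF to that quantity is not given here and is not reproduced in the sketch.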

Keywords

feature extraction; probability; speech recognition; ASR systems; BABEL project; Kaldi ASR toolkit; Kaldi pitch tracker; POV information; RAPT algorithm; automatic speech recognition systems; getf0 algorithm; nontonal languages; normalized autocorrelation measure; pitch information; pitch trajectory; probability-of-voicing estimates; unvoiced frames; voicing measure probability; Acoustics; Conferences; Feature extraction; Indexes; Signal processing algorithms; Speech; Speech recognition; Automatic Speech Recognition; Pitch; Probability Of Voicing; Tone

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location

Florence

Type

conf

DOI

10.1109/ICASSP.2014.6854049

Filename

6854049