Title :
Automatic voice onset time estimation of stops in continuous speech
Author :
Lin, Chi-yueh ; Wang, Hsiao-Chuan
Author_Institution :
Dept. of Electr. Eng., Nat. Tsing Hua Univ., Hsinchu, Taiwan
Date :
Nov. 29, 2010 - Dec. 3, 2010
Abstract :
Manually labeling the voice onset time (VOT) of stop consonants in a speech database is feasible but time-consuming and tedious. This paper proposes a fully automatic VOT estimation method to alleviate this burden. The method relies on an HMM-based phone recognizer and a random forest (RF) based onset detector. The phone recognizer performs forced alignment to locate stop consonants, and the onset detector searches each aligned stop segment for the onset of the burst and the onset of voicing. The time interval between these two onsets is the estimated VOT for that stop consonant. The merit of the proposed method lies in the RF-based onset detector, which provides accurate onset detection with only a small amount of training data. The proposed method was evaluated on the test set of the TIMIT database, which includes 2,344 word-initial and 1,440 word-medial stops. The experimental results show that 81.2% of the estimates fall within 10 ms of the reference values, and 95.7% within 20 ms.
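The last step of the method and the reported evaluation can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vot_ms` and `within_tolerance_rate` are hypothetical, the onset times are assumed to be given in seconds, VOT is assumed to be voicing onset minus burst onset, and the tolerance is assumed to be inclusive. The HMM forced alignment and the RF onset detector are not reproduced here.

```python
from typing import Sequence


def vot_ms(burst_onset_s: float, voicing_onset_s: float) -> float:
    """Return the VOT in milliseconds from two onset times in seconds.

    Assumption: VOT = voicing onset - burst onset.
    """
    return (voicing_onset_s - burst_onset_s) * 1000.0


def within_tolerance_rate(
    estimated_ms: Sequence[float],
    reference_ms: Sequence[float],
    tolerance_ms: float,
) -> float:
    """Return the fraction of estimates within tolerance_ms of the reference.

    Assumption: the tolerance is inclusive (|error| <= tolerance_ms).
    """
    if len(estimated_ms) != len(reference_ms) or len(estimated_ms) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(est - ref) <= tolerance_ms
        for est, ref in zip(estimated_ms, reference_ms)
    )
    return hits / len(estimated_ms)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the paper.
    estimated = [50, 72, 30, 95]
    reference = [55, 60, 31, 70]
    print(within_tolerance_rate(estimated, reference, 10))
    print(within_tolerance_rate(estimated, reference, 20))
```

Under these assumptions, the figures quoted in the abstract correspond to calling `within_tolerance_rate` with tolerances of 10 and 20 over all evaluated stops.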
Keywords :
hidden Markov models; speech processing; speech recognition; HMM based phone recognizer; RF based onset detector; TIMIT database; annotate voice onset time; automatic voice onset time estimation; continuous speech; fully automatic VOT estimation method; random forest based onset detector; speech database; time interval; VOT; forced alignment; random forest; stop consonant; voice onset time;
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location :
Tainan
Print_ISBN :
978-1-4244-6244-5
DOI :
10.1109/ISCSLP.2010.5684481