DocumentCode :
454577
Title :
A Database of Vocal Tract Resonance Trajectories for Research in Speech Processing
Author :
Li Deng ; Xiaodong Cui ; Pruvenok, R. ; Yanyi Chen ; Momen, S. ; Alwan, Abeer
Author_Institution :
Microsoft Res., Redmond, WA
Volume :
1
fYear :
2006
fDate :
14-19 May 2006
Abstract :
While vocal tract resonances (VTRs, or formants that are defined as such resonances) are known to play a critical role in human speech perception and in computer speech processing, there has been a lack of standard databases needed for the quantitative evaluation of automatic VTR extraction techniques. We report in this paper on our recent effort to create a publicly available database of the first three VTR frequency trajectories. The database contains a representative subset of the TIMIT corpus with respect to speaker, gender, dialect and phonetic context, with a total of 538 sentences. A Matlab-based labeling tool is developed, with high-resolution wideband spectrograms displayed to assist in visual identification of VTR frequency values, which are then recorded via mouse clicks and local spline interpolation. Special attention is paid to VTR values during consonant-to-vowel (CV) and vowel-to-consonant (VC) transitions, and to speech segments with vocal tract anti-resonances. Using this database, we quantitatively assess two common automatic VTR tracking techniques in terms of their average tracking errors analyzed within each of the six major broad phonetic classes as well as during CV and VC transitions. The potential use of the VTR database for research in several areas of speech processing is discussed.
Keywords :
audio databases; speech processing; Matlab-based labeling tool; consonant-to-vowel transitions; database; high-resolution wideband spectrograms; human speech perception; vocal tract resonance trajectories; vowel-to-consonant transitions; Computer languages; Frequency; Humans; Labeling; Resonance; Speech analysis; Video recording; Virtual colonoscopy; Visual databases
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
ISSN :
1520-6149
Print_ISBN :
1-4244-0469-X
Type :
conf
DOI :
10.1109/ICASSP.2006.1660034
Filename :
1660034