DocumentCode :
1101663
Title :
On temporal alignment of sentences of natural and synthetic speech
Author :
Höhne, Hans D. ; Coker, Cecil ; Levinson, Stephen E. ; Rabiner, Lawrence R.
Author_Institution :
Technische Universität, Berlin, West Germany
Volume :
31
Issue :
4
fYear :
1983
fDate :
8/1/1983 12:00:00 AM
Firstpage :
807
Lastpage :
813
Abstract :
One way to improve the quality of synthetic speech, and to learn about temporal aspects of speech recognition, is to study the problem of time-aligning pairs of spoken sentences. For example, one could evaluate various sets of duration rules for synthesis by comparing the time alignments of speech sounds within synthetic sentences to those of naturally spoken sentences. In this manner, an improved set of sound duration rules could be obtained by applying some objective measure to the alignment scores. For speech recognition applications, one could automatically label continuous speech from a hand-marked prototype to obtain models and/or statistical data on sounds within sentences. A key question in the automatic alignment of sentence-length utterances is whether the time warping methods developed for isolated word recognition can be extended to utterances of sentence length (up to several seconds long). A second key question is the reliability and accuracy of such an alignment. In this paper we investigate these questions. It is shown that, with some simple modifications, the dynamic time warping procedures used for isolated word recognition apply almost as well to the alignment of sentence-length utterances. It is also shown that, on average, the uncertainty in the location of significant events within the sentence is much smaller than the event durations, although the largest errors are longer than some event durations. Hence, one must apply caution in using the time alignment contour for synthesis or recognition applications.
Keywords :
Acoustic testing; Automatic testing; Helium; Labeling; Prototypes; Speech analysis; Speech recognition; Speech synthesis; Synthesizers; Uncertainty;
fLanguage :
English
Journal_Title :
Acoustics, Speech and Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
0096-3518
Type :
jour
DOI :
10.1109/TASSP.1983.1164174
Filename :
1164174