On the generation and use of a segment dictionary for speech coding, synthesis and recognition

Author

Chollet, G. ; Galliano, J.F. ; Lefevre, J.P. ; Viara, E.

Author_Institution

ENST, Paris, Cedex

Volume

fYear

1983

fDate

April 1983

Firstpage

1328

Lastpage

1331

Abstract

A methodology is described for obtaining a set of segments and rules that adequately represents the speech performance of a given speaker. The methodology starts from an initial set of diphones extracted from a neutral context and modifies this set with larger and/or smaller segments, depending on the match with natural utterances. Each segment is stored as a sequence of frames coded using LPC coefficients. An estimate of the likelihood of timescale distortion is associated with each frame; it represents knowledge of temporal variability that can be used by synthesis rules and/or pattern-matching algorithms. It is then shown how such a segment database can be used for 1) speech coding at a very low bit rate (∼400 bit/s), 2) synthesis from unrestricted text, and 3) continuous speech recognition.
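
As an illustration only, the segment storage described in the abstract (a sequence of LPC-coded frames, each carrying a timescale-distortion estimate) and the bit-rate arithmetic behind segment-index coding might be sketched as follows. The class names, field names, and every numeric value are assumptions made for this sketch and are not taken from the paper; the numbers were picked only so that the arithmetic lands at the order of magnitude quoted above.

```python
from dataclasses import dataclass
from math import ceil, log2
from typing import List


@dataclass
class Frame:
    # LPC coefficients for one analysis frame (the order is an assumption).
    lpc: List[float]
    # Hypothetical field: estimated likelihood that this frame is
    # stretched or compressed in time, on a 0..1 scale.
    stretch_likelihood: float


@dataclass
class Segment:
    # Illustrative label, e.g. a diphone name.
    label: str
    frames: List[Frame]


def bits_per_second(dictionary_size: int,
                    segments_per_second: float,
                    side_bits_per_segment: int) -> float:
    """Bit rate when each segment is sent as a dictionary index plus
    a fixed number of side-information bits (duration, pitch, gain...)."""
    index_bits = ceil(log2(dictionary_size))
    return segments_per_second * (index_bits + side_bits_per_segment)


# Assumed values: 1024 dictionary entries, 20 segments per second,
# 10 side-information bits per segment.
rate = bits_per_second(1024, 20, 10)
print(rate)
```

With these assumed values the index costs 10 bits per segment, so the total is 20 × (10 + 10) = 400 bit/s. A different dictionary size or segment rate changes the figure accordingly.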

Keywords

Acoustic distortion; Bit rate; Costs; Dictionaries; Linear predictive coding; Pattern matching; Speech coding; Speech processing; Speech recognition; Speech synthesis;

fLanguage

English

Publisher

IEEE

Conference_Title

Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '83.

Type

conf

DOI

10.1109/ICASSP.1983.1172018

Filename

1172018

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3062838