DocumentCode
312345
Title
Constructing multi-level speech database for spontaneous speech processing
Author
Hahn, Mimoo ; Kim, Sanghun ; Lee, Jung-Chul ; Lee, Yong-Ju
Author_Institution
Audio Inf. Process. Sect., Electron. & Telecommun. Res. Inst., Daejeon, South Korea
Volume
3
fYear
1996
fDate
3-6 Oct 1996
Firstpage
1930
Abstract
The paper describes a multi-level speech database for spontaneous speech processing. We designed the database to cover textual and acoustic variations ranging from declarative speech to spontaneous speech. The database is composed of 5 categories, which are, in order of decreasing spontaneity: spontaneous speech, interview, simulated interview, declarative speech with context, and declarative speech without context. In total, we collected 112 sets from 23 subjects (19 male, 4 female). The database was first transcribed using 15 transcription symbols according to our own transcription rules. In a second step, prosodic information will be added. The goals of this research are a comparative textual and prosodic analysis at each level and the quantification of the spontaneity of a diversified speech database for dialogue speech synthesis and recognition. A preliminary analysis of the transcribed texts shows that, as expected, spontaneous speech has more corrections, repetitions, and pauses than the other categories. In addition, the average number of sentences per turn is greater for spontaneous speech than for the other categories. From these results, we can quantify the spontaneity of the speech database.
Keywords
database management systems; interactive systems; natural language interfaces; speech processing; speech synthesis; acoustic variations; declarative speech; dialogue speech synthesis; diversified speech database; interview; multi level speech database; prosodic analysis; prosodic information; simulated interview; spontaneous speech processing; transcribed texts; transcription rules; transcription symbols; Context modeling; Databases; Information processing; Natural languages; Process design; Speech analysis; Speech processing; Speech recognition; Speech synthesis; Synthesizers;
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location
Philadelphia, PA
Print_ISBN
0-7803-3555-4
Type
conf
DOI
10.1109/ICSLP.1996.608012
Filename
608012