DocumentCode
2177436
Title
Enriching Mandarin speech recognition by incorporating a hierarchical prosody model
Author
Yang, Jyh-Her ; Liu, Ming-Chieh ; Chang, Hao-Hsiang ; Chiang, Chen-Yu ; Wang, Yih-Ru ; Chen, Sin-Horng
Author_Institution
Dept. of Commun. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan
fYear
2011
fDate
22-27 May 2011
Firstpage
5052
Lastpage
5055
Abstract
This paper presents a new probabilistic framework of Mandarin speech recognition by incorporating a sophisticated hierarchical prosody model into the conventional HMM-based system. The prosody model describes the relations of linguistic cues of various levels, break types and prosodic states which represent the prosody hierarchical structure, and prosody-related acoustic features. Aside from producing the recognized word sequences, the system also decodes other information including word's part-of-speech, punctuation marks, inter-syllable break types, and prosodic states of syllables. Experimental results on the TCC300 corpus, which consists of paragraphic utterances, showed that the proposed system significantly outperformed the baseline system. The word and character error rates decreased from 24.4% and 18.1% to 20.7% and 14.4% (or 15.2% and 20.4% relative improvements), respectively.
Keywords
hidden Markov models; speech recognition; HMM-based system; TCC300 corpus; baseline system; hierarchical prosody model; intersyllable break types; Mandarin speech recognition; punctuation marks; Acoustics; Decoding; Error analysis; Hidden Markov models; Pragmatics; Speech; Speech recognition; Hierarchical prosody model; Mandarin speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location
Prague
ISSN
1520-6149
Print_ISBN
978-1-4577-0538-0
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2011.5947492
Filename
5947492