Regional Information Center for Science and Technology - Automatic phonetic segmentation of Malay speech database

DocumentCode :

2971489

Title :

Automatic phonetic segmentation of Malay speech database

Author :

Ting, Chee-Ming ; Salleh, Sh-Hussain ; Tan, Tian-Swee ; Ariff, A.K.

Author_Institution :

Fac. of Electr. Univ., Johor

fYear :

2007

fDate :

10-13 Dec. 2007

Firstpage :

Lastpage :

Abstract :

This paper deals with automatic phonetic segmentation of Malay continuous speech. The study investigates fast, automatic phone segmentation for preparing databases for Malay concatenative Text-to-Speech (TTS) systems. A set of 35 Malay phones, suitable for building Malay TTS, has been chosen, and the segmentation experiment is based on this phone set. An HMM-based segmentation approach that uses the Viterbi forced alignment technique is adopted. We use continuous density HMMs (CDHMM) with Gaussian mixtures, which perform well in speech recognition, to prevent large segmentation errors. In addition, this paper presents an implicit boundary refinement method that is incorporated into the Viterbi phonetic alignment. In this approach, the HMMs are trained on phone tokens whose boundaries are extended into the adjacent phones. This increases the ability of the HMMs to model phone boundaries and provides an implicit boundary refinement effect when used in phonetic alignment, thus reducing segmentation errors. The approach improves the performance of baseline HMM segmentation from 42.39%, 74.83%, and 84.34% of automatic boundary marks within an error smaller than 5, 15, and 25 ms to 47.75%, 76.38%, and 85.55%, respectively.
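
The two ideas the abstract relies on, Viterbi forced alignment and the "boundaries within a tolerance" score, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one state per phone instead of a multi-state CDHMM, it takes precomputed per-frame log-likelihoods as input, and the names `forced_align` and `percent_within` are hypothetical. The 10 ms frame shift mentioned in the usage note is also an assumption.

```python
from typing import List, Sequence


def forced_align(loglik: Sequence[Sequence[float]]) -> List[int]:
    """Viterbi forced alignment for a known phone sequence.

    loglik[t][s] is the log-likelihood of frame t under the s-th phone of
    the transcription (assumed input; one state per phone for simplicity).
    Returns the start frame of each phone.
    """
    n_frames, n_phones = len(loglik), len(loglik[0])
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phone")
    neg_inf = float("-inf")
    score = [[neg_inf] * n_phones for _ in range(n_frames)]
    back = [[0] * n_phones for _ in range(n_frames)]
    score[0][0] = loglik[0][0]  # the path must start in the first phone
    for t in range(1, n_frames):
        for s in range(n_phones):
            stay = score[t - 1][s]
            move = score[t - 1][s - 1] if s > 0 else neg_inf
            # Left-to-right topology: stay in phone s or advance from s-1.
            if move > stay:
                score[t][s], back[t][s] = move + loglik[t][s], s - 1
            else:
                score[t][s], back[t][s] = stay + loglik[t][s], s
    # Backtrace from the last phone at the last frame.
    path = [n_phones - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [0] + [t for t in range(1, n_frames) if path[t] != path[t - 1]]


def percent_within(auto_ms: Sequence[float], ref_ms: Sequence[float],
                   tol_ms: float) -> float:
    """Percentage of automatic boundaries whose error is smaller than tol_ms."""
    hits = sum(1 for a, r in zip(auto_ms, ref_ms) if abs(a - r) < tol_ms)
    return 100.0 * hits / len(ref_ms)
```

To use it, convert the returned start frames to milliseconds (for example by multiplying by an assumed 10 ms frame shift) and compare them with manually labelled boundaries at tolerances of 5, 15, and 25 ms, the thresholds reported in the abstract.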

Keywords :

hidden Markov models; speech processing; speech synthesis; Gaussian mixture; HMM based segmentation; Malay concatenative text-to-speech systems; Malay continuous speech; Malay phone set; Malay speech database; Viterbi force alignment technique; Viterbi phonetic alignment; automatic phone segmentation; automatic phonetic segmentation; baseline HMM segmentation; be-side phones; continuous density HMM; implicit boundary refinement method; phone boundaries; phone tokens; segmentation errors; speech recognition; Automatic speech recognition; Biomedical engineering; Databases; Feature extraction; Natural languages; Training data; Viterbi algorithm

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Information, Communications & Signal Processing, 2007 6th International Conference on

Conference_Location :

Singapore

Print_ISBN :

978-1-4244-0982-2

Electronic_ISBN :

978-1-4244-0983-9

Type :

conf

DOI :

10.1109/ICICS.2007.4449574

Filename :

4449574

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2971489