Title :
Using visual information in automatic speech segmentation
Author :
Eren Akdemir; Tolga Ciloglu
Author_Institution :
Department of Electrical and Electronics Engineering, Orta Doğu Teknik Üniversitesi (Middle East Technical University), Turkey
fDate :
4/1/2008
Abstract :
In this study, the use of visual information in automatic speech segmentation is investigated. Automatic speech segmentation is an essential task in speech processing systems: it is needed for training speech recognition systems and for obtaining appropriate data in speech synthesis systems, among other uses. The motions of the upper and lower lips are incorporated into a hidden Markov model based segmentation process. The MOCHA-TIMIT database, which contains simultaneous articulograph and microphone recordings, was used to develop and test the models. Different feature vector compositions are proposed for incorporating the visual parameters into the automatic segmentation system. The average error of the system with respect to manual segmentation is reduced by 10.1%. The results are examined in a boundary-class dependent manner, and the performance of the system on different boundary types is discussed. After analyzing the boundary-class dependent performance, the system performance is improved by 12.1% by using the visual feature vector only at selected boundary types.
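Illustrative sketch :
The abstract describes augmenting acoustic features with upper and lower lip motion before HMM-based segmentation. The following is a minimal sketch of that kind of feature composition, not the authors' code: the file name, sampling rates, frame shift, and the use of librosa and NumPy are illustrative assumptions.

# Hypothetical audio-visual feature composition: 13 MFCCs per frame
# plus two lip-position parameters, as an illustration of the idea only.
import numpy as np
import librosa

def audio_visual_features(wav_path, lip_upper, lip_lower, lip_rate_hz):
    """Return one observation matrix: 13 MFCCs + 2 lip positions per frame."""
    y, sr = librosa.load(wav_path, sr=16000)
    hop = 160  # 10 ms frame shift at 16 kHz (assumed)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=hop).T  # (T, 13)

    # Resample the articulograph lip trajectories to the MFCC frame times.
    frame_times = np.arange(mfcc.shape[0]) * hop / sr
    lip_times = np.arange(len(lip_upper)) / lip_rate_hz
    upper = np.interp(frame_times, lip_times, lip_upper)
    lower = np.interp(frame_times, lip_times, lip_lower)

    # Concatenate acoustic and visual streams into one feature vector per frame,
    # which could then be fed to an HMM trainer/aligner.
    return np.hstack([mfcc, upper[:, None], lower[:, None]])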
Keywords :
"Hidden Markov models","Mel frequency cepstral coefficient","Speech","Motion segmentation","Speech processing","Visualization","Speech recognition"
Conference_Titel :
2008 IEEE 16th Signal Processing, Communication and Applications Conference (SIU 2008)
Print_ISBN :
978-1-4244-1998-2
DOI :
10.1109/SIU.2008.4632641