A hybrid architecture for automatic segmentation of speech waveforms

Author

Mporas, Iosif ; Ganchev, Todor ; Fakotakis, Nikos

Author_Institution

Dept. Electr. & Comput. Eng., Patras Univ., Rio Patras

fYear

2008

fDate

March 31 2008-April 4 2008

Firstpage

4457

Lastpage

4460

Abstract

In the present work, we propose a hybrid architecture for automatic alignment of speech waveforms with their corresponding phone sequences. The proposed architecture does not exploit any phone boundary information. Our approach combines the efficiency of embedded training techniques with the high performance of isolated-unit training. Evaluating on the TIMIT database, an established benchmark for phone segmentation, we achieved an accuracy of 83.56%, which corresponds to an improvement of 6.09% over the baseline system's accuracy.

Keywords

speech processing; TIMIT database; automatic alignment; automatic speech waveform segmentation; embedded training techniques; isolated-unit training; phone boundary information; phone sequence; Artificial intelligence; Computer architecture; Databases; Feature extraction; Hidden Markov models; Natural languages; Speech recognition; Text recognition; Viterbi algorithm; Wire; Speech segmentation; embedded training

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on

Conference_Location

Las Vegas, NV

ISSN

1520-6149

Print_ISBN

978-1-4244-1483-3

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2008.4518645

Filename

4518645