Regional Information Center for Science and Technology - Audio segmentation for speech recognition using segment features

DocumentCode :

3528687

Title :

Audio segmentation for speech recognition using segment features

Author :

Rybach, David ; Gollan, Christian ; Schluter, Ralf ; Ney, Hermann

Author_Institution :

Comput. Sci. Dept., RWTH Aachen Univ., Aachen

fYear :

2009

fDate :

19-24 April 2009

Firstpage :

4197

Lastpage :

4200

Abstract :

Audio segmentation is an essential preprocessing step in several audio processing applications with a significant impact e.g. on speech recognition performance. We introduce a novel framework which combines the advantages of different well known segmentation methods. An automatically estimated log-linear segment model is used to determine the segmentation of an audio stream in a holistic way by a maximum a posteriori decoding strategy, instead of classifying change points locally. A comparison to other segmentation techniques in terms of speech recognition performance is presented, showing a promising segmentation quality of our approach.
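The abstract contrasts global maximum a posteriori decoding over whole segmentations with local classification of change points. The following is a minimal sketch of that idea, not the authors' implementation: it scores each candidate segment with a log-linear model and finds the best-scoring segmentation by dynamic programming. The two segment features (within-segment scatter and a per-segment penalty), the weights, and the names `segment_score` and `map_segmentation` are assumptions made for illustration; the record does not specify the features or the estimation procedure used in the paper.

```python
from typing import List, Sequence, Tuple


def segment_score(frames: Sequence[float], start: int, end: int,
                  weights: Tuple[float, float]) -> float:
    """Unnormalized log-linear score of the segment frames[start:end].

    Hypothetical features: negative within-segment sum of squares
    (rewards homogeneous segments) and a constant -1 per segment
    (penalizes over-segmentation).
    """
    seg = frames[start:end]
    mean = sum(seg) / len(seg)
    scatter = sum((x - mean) ** 2 for x in seg)
    features = (-scatter, -1.0)
    return sum(w * f for w, f in zip(weights, features))


def map_segmentation(frames: Sequence[float],
                     weights: Tuple[float, float],
                     max_len: int) -> List[Tuple[int, int]]:
    """Return the segmentation with the highest total score.

    The normalization constant of the log-linear model is the same for
    every segmentation, so the arg max only needs the unnormalized sum
    of segment scores.
    """
    n = len(frames)
    best = [float("-inf")] * (n + 1)  # best[e]: best score of frames[:e]
    back = [0] * (n + 1)              # back[e]: start of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + segment_score(frames, start, end, weights)
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Trace the boundaries back from the end of the stream.
    segments = []
    end = n
    while end > 0:
        segments.append((back[end], end))
        end = back[end]
    segments.reverse()
    return segments


# Toy stream: four low-valued frames followed by four high-valued frames.
frames = [0.0] * 4 + [5.0] * 4
segments = map_segmentation(frames, weights=(1.0, 1.0), max_len=8)
print(segments)
```

Because every boundary is chosen jointly, a change point is placed only where it improves the score of the whole segmentation, which is the contrast with local change-point classification that the abstract draws.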

Keywords :

audio streaming; maximum likelihood estimation; speech coding; speech recognition; audio processing; audio segmentation; audio stream; log-linear segment model; maximum a posteriori decoding; segment features; Automatic speech recognition; Bayesian methods; Broadcasting; Decoding; Humans; Loudspeakers; Natural languages; Pattern recognition; Streaming media; broadcast news transcription

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location :

Taipei

ISSN :

1520-6149

Print_ISBN :

978-1-4244-2353-8

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2009.4960554

Filename :

4960554

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3528687