DocumentCode :
3528687
Title :
Audio segmentation for speech recognition using segment features
Author :
Rybach, David ; Gollan, Christian ; Schlüter, Ralf ; Ney, Hermann
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen
fYear :
2009
fDate :
19-24 April 2009
Firstpage :
4197
Lastpage :
4200
Abstract :
Audio segmentation is an essential preprocessing step in several audio processing applications and has a significant impact on, for example, speech recognition performance. We introduce a novel framework which combines the advantages of different well-known segmentation methods. An automatically estimated log-linear segment model is used to determine the segmentation of an audio stream holistically, by a maximum a posteriori decoding strategy, instead of classifying change points locally. A comparison with other segmentation techniques in terms of speech recognition performance is presented, showing promising segmentation quality for our approach.
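The idea of choosing a whole segmentation at once, rather than classifying change points one by one, can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the function names (`segment_score`, `map_segmentation`), the two toy segment features (a per-segment bias and the within-segment scatter) and the hand-set weights are all assumptions made for illustration, and the abstract does not state which features or estimation procedure the paper uses. Because the normalization term of a log-linear model does not depend on the segmentation in this toy setup, the sketch maximizes the unnormalized sum of segment scores.

```python
from typing import List, Sequence, Tuple


def segment_score(frames: Sequence[float], start: int, end: int,
                  weights: Tuple[float, float]) -> float:
    """Log-linear score of the segment frames[start:end].

    Hypothetical features: (1) a constant per-segment bias and
    (2) the sum of squared deviations from the segment mean.
    """
    seg = frames[start:end]
    mean = sum(seg) / len(seg)
    scatter = sum((x - mean) ** 2 for x in seg)
    features = (1.0, scatter)
    return sum(w * f for w, f in zip(weights, features))


def map_segmentation(frames: Sequence[float],
                     weights: Tuple[float, float]) -> List[Tuple[int, int]]:
    """Return the segmentation with the highest total segment score.

    best[t] holds the best score of any segmentation of frames[:t];
    back[t] holds the start index of the last segment in that segmentation.
    """
    n = len(frames)
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            cand = best[start] + segment_score(frames, start, end, weights)
            if cand > best[end]:
                best[end] = cand
                back[end] = start
    # Trace the stored boundaries back from the end of the stream.
    segments: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        segments.append((back[end], end))
        end = back[end]
    segments.reverse()
    return segments


if __name__ == "__main__":
    # Toy one-dimensional "audio stream" with one obvious change point.
    stream = [0.0] * 5 + [10.0] * 5
    print(map_segmentation(stream, weights=(-1.0, -1.0)))
```

With negative weights, every extra segment and every unit of within-segment scatter lowers the score, so the decoder trades the number of segments against their homogeneity over the whole stream. In a trained model the weights would be estimated from data rather than set by hand.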
Keywords :
audio streaming; maximum likelihood estimation; speech coding; speech recognition; audio processing; audio segmentation; audio stream; log-linear segment model; maximum a posteriori decoding; segment features; Automatic speech recognition; Bayesian methods; Broadcasting; Decoding; Humans; Loudspeakers; Natural languages; Pattern recognition; Streaming media; broadcast news transcription
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
ISSN :
1520-6149
Print_ISBN :
978-1-4244-2353-8
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2009.4960554
Filename :
4960554