DocumentCode :
673304
Title :
Multimodal English corpus for automatic speech recognition
Author :
Kunka, Bartosz ; Kupryjanow, Adam ; Dalka, Piotr ; Bratoszewski, Piotr ; Szczodrak, M. ; Spaleniak, Pawel ; Szykulski, Marcin ; Czyzewski, Andrzej
Author_Institution :
Multimedia Syst. Dept. (MSD), Gdansk Univ. of Technol. (GUT), Gdansk, Poland
fYear :
2013
fDate :
26-28 Sept. 2013
Firstpage :
106
Lastpage :
111
Abstract :
A multimodal corpus developed for research on speech recognition based on audio-visual data is presented. Besides the usual video and sound excerpts, the prepared database also contains thermovision images and depth maps. All streams were recorded simultaneously, so the corpus makes it possible to examine the importance of the information provided by the different modalities. The recordings can also be used to develop a speech recognition system that analyzes several modalities at the same time. The paper describes the process of multimodal material collection and the post-processing procedure applied to this material. Parameterization methods for signals belonging to different modalities are also proposed.
Keywords :
audio-visual systems; speech recognition; audio-visual data; automatic speech recognition; database; depth maps; multimodal English corpus; multimodal material collection; parameterization methods; post-processing procedure; sound excerpts; thermovision images; video excerpts; Cameras; Image recognition; Labeling; Silicon; Spatial resolution; Stereo image processing; ASR system; English corpus; audio-video recordings database; audio-visual speech recognition system; automatic speech recognition; multi-stream corpus; multimodal corpus;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), 2013
Conference_Location :
Poznan
ISSN :
2326-0262
Electronic_ISBN :
2326-0262
Type :
conf
Filename :
6710606