DocumentCode
3167906
Title
Automatic caption generation for video data. Time alignment between caption and acoustic signal
Author
Watanabe, K. ; Sugiyama, M.
Author_Institution
Graduate Sch. of Comput. Sci. & Eng., Aizu Univ., Fukushima, Japan
fYear
1999
fDate
1999
Firstpage
65
Lastpage
70
Abstract
This paper discusses automatic caption generation, focusing on the correspondence between Japanese text and its speech data. It proposes a time alignment module implemented using DP matching and evaluates its performance. With the weights and DP path optimized, the gap between the correct and estimated caption display times is less than 39.0 ms at phoneme boundaries. The effects of other speakers' phoneme templates and of text phrase deletion are also evaluated.
Keywords
handicapped aids; speech recognition; video signal processing; DP matching; Japanese text; acoustic signal; automatic caption generation; performance evaluation; phoneme templates; speech data; text phrase deletion; time alignment; video data; Acoustical engineering; Auditory system; Computer science; Costs; Data engineering; Displays; Signal generators; Speech; TV broadcasting; Timing
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Signal Processing, 1999 IEEE 3rd Workshop on
Conference_Location
Copenhagen
Print_ISBN
0-7803-5610-1
Type
conf
DOI
10.1109/MMSP.1999.793799
Filename
793799