Semantic Segmentation of Documentary Video using Music Breaks

Author

Dong, Aijuan ; Li, Honglin

Author_Institution

Dept. of Comput. Sci., North Dakota State Univ., Fargo, ND

fYear

2006

fDate

9-12 July 2006

Firstpage

1825

Lastpage

1828

Abstract

Many documentary videos use background music to help structure the content and communicate its semantics. In this paper, we investigate semantic segmentation of documentary video using music breaks. We first define video semantic units based on the speech text that a video/audio stream contains, and then propose a three-step procedure for semantic video segmentation using music breaks. Since the music breaks of a documentary video are of different semantic levels, we also study how different speech/music segment lengths correlate with the semantic level of a music break. Our experimental results show that music breaks can effectively segment a continuous documentary video stream into semantic units with an average F-score of 0.91, and that the lengths of combined segments (a speech segment plus the music segment that follows it) strongly correlate with the semantic levels of music breaks.

Keywords

document image processing; image segmentation; music; speech processing; video streaming; documentary video stream; music break; semantic video segmentation; speech text; Automatic speech recognition; Bandwidth; Computer networks; Computer science; Indexing; Mel frequency cepstral coefficient; Music; Neodymium; Streaming media; Video signal processing

fLanguage

English

Publisher

IEEE

Conference_Titel

Multimedia and Expo, 2006 IEEE International Conference on

Conference_Location

Toronto, Ont.

Print_ISBN

1-4244-0366-7

Electronic_ISBN

1-4244-0367-7

Type

conf

DOI

10.1109/ICME.2006.262908

Filename

4036977