Comparative analysis of hidden Markov models for multi-modal dialogue scene indexing

Author

Alatan, A. Aydın ; Akansu, Ali N. ; Wolf, Wayne

Author_Institution

Center for Multimedia Res., New Jersey Inst. of Technol., Newark, NJ, USA

Volume

6

fYear

2000

fDate

2000

Firstpage

2401

Abstract

A class of audio-visual content is segmented into dialogue scenes using the state transitions of a novel hidden Markov model (HMM). Each shot is classified using both the audio track and the visual content to determine the state/scene transitions of the model. In simulations with circular and left-to-right HMM topologies, both are observed to perform very well with multi-modal inputs. Moreover, for the circular topology, comparisons between different training and observation sets show that audio and face information together give the most consistent results across observation sets.
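
The difference between the two topologies named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the two-state layout, the two observation symbols (imagined here as a per-shot label such as "speech with face" versus "no speech, no face"), and every probability are hypothetical values chosen only for illustration. The sketch builds a left-to-right and a circular transition matrix, decodes a sequence of per-shot labels with the standard Viterbi algorithm, and reports the shot indices where the decoded state changes.

```python
import math


def left_to_right(n, stay=0.7):
    """Transition matrix where state i may only stay or advance to i + 1."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == n - 1:
            A[i][i] = 1.0  # the last state is absorbing
        else:
            A[i][i] = stay
            A[i][i + 1] = 1.0 - stay
    return A


def circular(n, stay=0.7):
    """Same as left-to-right, except the last state wraps back to the first."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += stay
        A[i][(i + 1) % n] += 1.0 - stay
    return A


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, A, B, pi):
    """Most likely state path for obs (a list of symbol indices), in log space."""
    n = len(A)
    delta = [_log(pi[s]) + _log(B[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(A[r][s]))
            ptr.append(best)
            delta.append(prev[best] + _log(A[best][s]) + _log(B[s][o]))
        back.append(ptr)
    state = max(range(n), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def scene_boundaries(path):
    """Indices of shots whose decoded state differs from the previous shot."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


# Hypothetical emission probabilities: B[state][symbol].
B = [[0.9, 0.1], [0.1, 0.9]]
pi = [1.0, 0.0]            # assume the sequence starts in state 0
shots = [0, 0, 1, 1, 0, 0]  # hypothetical per-shot labels

print(scene_boundaries(viterbi(shots, left_to_right(2), B, pi)))  # [2]
print(scene_boundaries(viterbi(shots, circular(2), B, pi)))       # [2, 4]
```

With these made-up numbers the left-to-right model can leave state 0 only once, so it reports a single boundary, while the circular model can return to state 0 and reports two. This shows only the structural difference between the topologies and says nothing about the results reported in the paper.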

Keywords

audio signal processing; audio-visual systems; database indexing; hidden Markov models; image classification; image segmentation; multimedia databases; topology; video databases; video signal processing; audio information; audio track; audio-visual content segmentation; circular topology; face information; hidden Markov models; left-to-right topology; multi-modal dialogue scene indexing; multi-modal inputs; observation sets; scene transitions; simulations; state transitions; training sets; video shot classification; Computer networks; Data mining; Electronic mail; Hidden Markov models; Image analysis; Indexing; Layout; Motion pictures; Network topology; Production;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on

Conference_Location

Istanbul

ISSN

1520-6149

Print_ISBN

0-7803-6293-4

Type

conf

DOI

10.1109/ICASSP.2000.859325

Filename

859325