Multimedia Content Segmentation Based on Speaker Recognition

Author

Babu, Jasine ; Pathari, Vinod

Author_Institution

Motorola India Pvt. Ltd., Bangalore

fYear

2007

fDate

22-24 Feb. 2007

Firstpage

16

Lastpage

19

Abstract

Many recent works attempt to index multimedia data based on characteristics such as speaker identity and emotional content. In this work, speaker segmentation is performed on movies to extract the shots in which the target actor is speaking. This is a case of speaker identification on conversational speech under noisy conditions. The work is organized into two phases: an audio classification phase, which removes non-speech content, followed by a speaker recognition phase. Along with the speaker models, Gaussian mixture models are constructed for sound effects such as fight sequences and drum beats to refine the removal of non-speech sounds. Results demonstrate the effectiveness of this deviation from conventional methods.
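
The two-phase flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the features, the model sizes, or how the audio classification phase is built. The sketch assumes one scalar feature per frame, hand-set 1-D Gaussian mixture parameters, and a speech versus non-speech GMM comparison for phase one. All names (find_target_segments, SPEECH, NONSPEECH, SPEAKERS, EFFECTS, SEGMENTS) are hypothetical.

```python
import math

def gmm_log_likelihood(x, components):
    """Log-likelihood of a scalar x under a 1-D GMM.

    components is a list of (weight, mean, variance) tuples.
    """
    terms = []
    for weight, mean, var in components:
        log_pdf = -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        terms.append(math.log(weight) + log_pdf)
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def avg_log_likelihood(frames, components):
    """Average per-frame log-likelihood of a segment under one GMM."""
    return sum(gmm_log_likelihood(x, components) for x in frames) / len(frames)

def best_model(frames, models):
    """Name of the model in `models` that scores the segment highest."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))

def find_target_segments(segments, speech, nonspeech, speakers, effects, target):
    """Return indices of segments attributed to the target speaker."""
    hits = []
    for idx, frames in enumerate(segments):
        # Phase 1 (assumed form): discard segments that look like non-speech.
        if avg_log_likelihood(frames, nonspeech) > avg_log_likelihood(frames, speech):
            continue
        # Phase 2: speaker models compete with sound-effect models, so a
        # sound effect that slipped through phase 1 can still be rejected.
        pool = {**speakers, **effects}
        if best_model(frames, pool) == target:
            hits.append(idx)
    return hits

# Toy, hand-set parameters (illustrative assumptions, not trained models).
SPEECH = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
NONSPEECH = [(1.0, 8.0, 4.0)]
SPEAKERS = {"target": [(1.0, 1.0, 0.25)], "other": [(1.0, -1.0, 0.25)]}
EFFECTS = {"drums": [(1.0, 3.0, 0.25)]}

SEGMENTS = [
    [0.9, 1.1, 1.0],     # target actor speaking
    [-1.0, -0.9, -1.2],  # another speaker
    [8.0, 7.5, 8.2],     # clear non-speech, removed in phase 1
    [3.0, 2.9, 3.1],     # drum-like sound that passes phase 1
]

RESULT = find_target_segments(SEGMENTS, SPEECH, NONSPEECH, SPEAKERS, EFFECTS, "target")
```

In this toy setup the fourth segment passes the speech check but is claimed by the hypothetical drums model in phase two, which is the kind of refinement the abstract attributes to the sound-effect models.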

Keywords

Gaussian processes; audio signal processing; multimedia communication; speaker recognition; Gaussian mixture model; multimedia content segmentation; Acoustic noise; Data mining; Face detection; Indexing; Information retrieval; Loudspeakers; Motion pictures; Multimedia databases; Speech processing

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing, Communications and Networking, 2007. ICSCN '07. International Conference on

Conference_Location

Chennai

Print_ISBN

1-4244-0997-7

Electronic_ISBN

1-4244-0997-7

Type

conf

DOI

10.1109/ICSCN.2007.350672

Filename

4156575