DocumentCode
2409616
Title
Multimedia Content Segmentation Based on Speaker Recognition
Author
Babu, Jasine ; Pathari, Vinod
Author_Institution
Motorola India Pvt. Ltd., Bangalore
fYear
2007
fDate
22-24 Feb. 2007
Firstpage
16
Lastpage
19
Abstract
Many recent works attempt to index multimedia data based on characteristics such as speaker identity and emotional content. In this work, speaker segmentation is performed on movies to extract the shots in which the target actor is speaking. As a case of speaker identification on conversational speech under noisy conditions, the work is organized into two phases: an audio classification phase, for the removal of non-speech content, followed by a speaker recognition phase. Along with the speaker models, Gaussian mixture models are constructed for sound effects such as fight sequences and drum beats to refine the removal of non-speech sounds. Results show the effectiveness of this deviation from conventional methods.
Keywords
Gaussian processes; audio signal processing; multimedia communication; speaker recognition; Gaussian mixture model; multimedia content segmentation; Acoustic noise; Data mining; Face detection; Indexing; Information retrieval; Loudspeakers; Motion pictures; Multimedia databases; Speech processing
fLanguage
English
Publisher
ieee
Conference_Titel
Signal Processing, Communications and Networking, 2007. ICSCN '07. International Conference on
Conference_Location
Chennai
Print_ISBN
1-4244-0997-7
Electronic_ISBN
1-4244-0997-7
Type
conf
DOI
10.1109/ICSCN.2007.350672
Filename
4156575