DocumentCode
1492683
Title
Online Diarization of Streaming Audio-Visual Data for Smart Environments
Author
Schmalenstroeer, Joerg ; Haeb-Umbach, Reinhold
Author_Institution
Dept. of Commun. Eng., Univ. of Paderborn, Paderborn, Germany
Volume
4
Issue
5
fYear
2010
Firstpage
845
Lastpage
856
Abstract
For an environment to be perceived as being smart, contextual information has to be gathered to adapt the system's behavior and its interface towards the user. Being a rich source of context information, speech can be acquired unobtrusively by microphone arrays and then processed to extract information about the user and his environment. In this paper, a system for joint temporal segmentation, speaker localization, and identification is presented, which is supported by face identification from video data obtained from a steerable camera. Special attention is paid to latency aspects and online processing capabilities, as they are important for the application under investigation, namely ambient communication. Ambient communication describes the vision of terminal-less, session-less and multi-modal telecommunication with remote partners, where the user can move freely within his home while the communication follows him. The speaker diarization serves as a context source, which has been integrated in a service-oriented middleware architecture and provided to the application to select the most appropriate I/O device and to steer the camera towards the speaker during ambient communication.
Keywords
audio streaming; face recognition; image segmentation; middleware; software architecture; speaker recognition; telecommunication computing; video streaming; audio visual data streaming; context information speech; face identification; multimodal telecommunication; online diarization; service oriented middleware architecture; sessionless telecommunication; speaker identification; speaker localization; steerable camera; temporal segmentation; terminal-less telecommunication; Cameras; Context; Context-aware services; Data mining; Delay; Face detection; Microphone arrays; Middleware; Speech processing; Streaming media; ambient communication; diarization; middleware
fLanguage
English
Journal_Title
Selected Topics in Signal Processing, IEEE Journal of
Publisher
IEEE
ISSN
1932-4553
Type
jour
DOI
10.1109/JSTSP.2010.2050519
Filename
5466034
Link To Document