DocumentCode
1668454
Title
Using The Voice Spectrum For Improved Tracking Of People In A Joint Audio-Video Scheme
Author
D'Arca, Eleonora ; Robertson, Neil M. ; Hopgood, James
Author_Institution
Joint Res. Inst. for Signal & Image Process., Heriot-Watt Univ., Edinburgh, UK
fYear
2013
Firstpage
3622
Lastpage
3626
Abstract
In this paper we present a new solution to the problem of speaker tracking among people where occlusions occur (disappearance and non-speaking). In a normal conversation between two or more people, we learn speaker mel-cepstral coefficients (MFCC) and incorporate this information into a sequential Bayesian audio-video position tracker. The joint video-to-audio data association step is thus improved, and we achieve robust person recognition, which in turn aids tracking performance. We provide a comprehensive evaluation via simulations and real data, quoting tracking accuracy, precision and diarisation error rate (DER) compared to ground truth. For simulated and real experiments in an open space, the trajectory tracking performance, measured against ground truth, increases by 20% using our approach. As a further enhancement over the state of the art, speaker identity recognition at a distance is improved by 20% by exploiting audio-video localisation cues.
Keywords
audio signal processing; speaker recognition; video signal processing; aids tracking performance; audio-video localisation cues; data quoting tracking accuracy; diarisation error rate; joint audio-video scheme; mel-cepstral coefficients; sequential Bayesian audio-video position tracker; speaker identity recognition; speaker tracking; trajectory tracking performance; video-to-audio data association step; voice spectrum; Accuracy; Cameras; Density estimation robust algorithm; Speaker recognition; Speech; Target tracking; Trajectory; Distant Speaker Recognition; EKF; MFCC; Multimodal tracking; Speaker Tracking
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6638333
Filename
6638333