Regional Information Center for Science and Technology - Single-channel multi-talker-localization based on maximum likelihood

DocumentCode :

1655512

Title :

Single-channel multi-talker-localization based on maximum likelihood

Author :

Takashima, Ryoichi ; Takiguchi, Tetsuya ; Ariki, Yasuo

Author_Institution :

Grad. Sch. of Eng., Kobe Univ., Kobe, Japan

fYear :

2009

Firstpage :

461

Lastpage :

464

Abstract :

This paper presents a sound source (talker) localization method using only a single microphone based upon maximum likelihood. In our previous work, we proposed GMM (Gaussian mixture model) separation for estimation of the sound source direction, where the observed (reverberant) speech is separated into the acoustic transfer function and the clean speech GMM, and showed its effectiveness for the single-talker localization task. In this paper, we discuss a multi-talker localization method using GMM separation and model composition. Model composition is used to represent speech signals observed in a reverberant environment corresponding to every conceivable combination of positions of the sound sources, where composite models are obtained through composition of the talker's speech model and acoustic transfer functions estimated using GMM separation. For each test data set, we find a maximum-likelihood model from among the composite models corresponding to each combination of talkers' positions. The effectiveness of this method has been confirmed by two-talker localization experiments performed in a room environment.
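The maximum-likelihood selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the composite models are already available as diagonal-covariance GMMs, one per candidate pair of talker positions, and it does not cover GMM separation or the model composition itself. The function names `gmm_log_likelihood` and `select_position_pair`, the dictionary layout, and the position labels are hypothetical and chosen only for illustration.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of frames (T, D) under a diagonal-covariance GMM.

    weights: (K,), means: (K, D), variances: (K, D). Hypothetical layout.
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = log_norm[None, :] - 0.5 * np.sum(
        diff ** 2 / variances[None, :, :], axis=2
    )                                                                  # (T, K)
    log_weighted = np.log(weights)[None, :] + log_comp
    # Numerically stable log-sum-exp over the K mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    per_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(per_frame.sum())


def select_position_pair(frames, composite_models):
    """Return the position pair whose composite model maximizes the likelihood.

    composite_models maps a (position_a, position_b) tuple to a
    (weights, means, variances) tuple. This layout is an assumption.
    """
    scores = {
        pair: gmm_log_likelihood(frames, *params)
        for pair, params in composite_models.items()
    }
    best_pair = max(scores, key=scores.get)
    return best_pair, scores
```

In use, `frames` would hold the feature vectors extracted from the single-microphone test recording, and `composite_models` would hold one entry per combination of candidate talker positions; the returned pair is the estimated combination of positions under these assumptions.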

Keywords :

Gaussian processes; acoustic signal processing; direction-of-arrival estimation; maximum likelihood estimation; microphones; reverberation; signal representation; source separation; speech recognition; transfer functions; Gaussian mixture model; acoustic transfer function; clean speech GMM separation; maximum-likelihood estimation model; reverberant room environment; single microphone; single-channel multitalker-localization; single-talker localization; sound source direction estimation; sound source localization method; speech signal representation; talker speech model composition; two-talker localization experiment; Acoustic testing; Acoustical engineering; Cepstral analysis; Maximum likelihood estimation; Microphone arrays; Multiple signal classification; Phased arrays; Speech; Training data; Transfer functions; acoustic transfer function; maximum likelihood; model composition; single channel; talker localization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Statistical Signal Processing, 2009. SSP '09. IEEE/SP 15th Workshop on

Conference_Location :

Cardiff

Print_ISBN :

978-1-4244-2709-3

Electronic_ISBN :

978-1-4244-2711-6

Type :

conf

DOI :

10.1109/SSP.2009.5278540

Filename :

5278540

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1655512