Efficient Speaker Change Detection Using Adapted Gaussian Mixture Models

Author

Malegaonkar, Amit S. ; Ariyaeeinia, Aladdin M. ; Sivakumaran, Perasiriyan

Author_Institution

Trinity Convergence India Pvt. Ltd., Pune

Volume

15

Issue

6

fYear

2007

Firstpage

1859

Lastpage

1869

Abstract

A new approach to speaker change detection is proposed and investigated. The method, which is based on a probabilistic framework, provides an effective means of tackling the problem posed by phonetic variation in high-resolution speaker change detection. Additionally, the approach incorporates the capability for dealing with undesired effects of variations in speech characteristics. Experimental investigations conducted with clean and broadcast news audio show that the proposed method is significantly more effective than the currently popular techniques for speaker change detection. To enhance the computational efficiency of the proposed method, modified implementation algorithms are introduced, based on the exploitation of redundant operations and a fast scoring procedure. It is shown that, through the use of the proposed fast algorithm, the computational efficiency of the approach can be increased by over 77% without significant reduction in its accuracy. The paper discusses the principles and characteristics of the proposed speaker change detection method and provides a detailed description of its efficient implementation. The experiments investigating the performance of the proposed method and its effectiveness relative to other approaches are described, and an analysis of the results is presented.

Keywords

Gaussian processes; speaker recognition; Gaussian mixture models; computational efficiency; phonetic variation; speaker change detection; Acoustic signal detection; Broadcasting; Change detection algorithms; Indexing; Loudspeakers; Performance analysis; Speech recognition; Streaming media; Testing; Bilateral scoring; phonetic heterogeneity; probabilistic approach

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2007.896665

Filename

4276758