• DocumentCode
    1060167
  • Title

    Efficient Speaker Change Detection Using Adapted Gaussian Mixture Models

  • Author

    Malegaonkar, Amit S. ; Ariyaeeinia, Aladdin M. ; Sivakumaran, Perasiriyan

  • Author_Institution
    Trinity Convergence India Pvt. Ltd., Pune
  • Volume
    15
  • Issue
    6
  • fYear
    2007
  • Firstpage
    1859
  • Lastpage
    1869
  • Abstract
    A new approach to speaker change detection is proposed and investigated. The method, which is based on a probabilistic framework, provides an effective means for tackling the problem posed by phonetic variation in high-resolution speaker change detection. Additionally, the approach incorporates the capability for dealing with undesired effects of variations in speech characteristics. Using experimental investigations conducted with clean and broadcast news audio, it is shown that the proposed method is significantly more effective than the currently popular techniques for speaker change detection. To enhance the computational efficiency of the proposed method, modified implementation algorithms are introduced which are based on the exploitation of redundant operations and a fast scoring procedure. It is shown that, through the use of the proposed fast algorithm, the computational efficiency of the approach can be increased by over 77% without significant reduction in its accuracy. The paper discusses the principles and characteristics of the proposed speaker change detection method, and provides a detailed description of its efficient implementation. The experiments, investigating the performance of the proposed method and its effectiveness in relation to other approaches, are described and an analysis of the results is presented.
  • Keywords
    Gaussian processes; speaker recognition; Gaussian mixture models; computational efficiency; phonetic variation; speaker change detection; Acoustic signal detection; Broadcasting; Change detection algorithms; Computational efficiency; Indexing; Loudspeakers; Performance analysis; Speech recognition; Streaming media; Testing; Bilateral scoring; phonetic heterogeneity; probabilistic approach
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2007.896665
  • Filename
    4276758