• DocumentCode
    3162387
  • Title
    Speech overlap detection and attribution using convolutive non-negative sparse coding
  • Author
    Vipperla, Ravichander ; Geiger, Jürgen T. ; Bozonnet, Simon ; Wang, Dong ; Evans, Nicholas ; Schuller, Björn ; Rigoll, Gerhard
  • Author_Institution
    Multimedia Communications Department, EURECOM, Sophia Antipolis, France
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4181
  • Lastpage
    4184
  • Abstract
    Overlapping speech is known to degrade speaker diarization performance, with impacts on speaker clustering and segmentation. While previous work has made important advances in detecting overlapping speech intervals and in attributing them to the relevant speakers, the problem remains largely unsolved. This paper reports the first application of convolutive non-negative sparse coding (CNSC) to the overlap problem. CNSC aims to decompose a composite signal into its underlying contributory parts and is thus naturally suited to overlap detection and attribution. Experimental results on NIST RT data show that the CNSC approach gives results comparable to those of a state-of-the-art hidden Markov model-based overlap detector. In a practical diarization system, CNSC-based speaker attribution is shown to reduce the speaker error by over 40% relative in overlapping segments.
  • Keywords
    encoding; speaker recognition; CNSC approach; NIST RT data; composite signal; convolutive non-negative sparse coding; hidden Markov model; speaker clustering; speaker diarization performance; speaker segmentation; speech overlap detection; Density estimation robust algorithm; Error analysis; Hidden Markov models; Matrix decomposition; Sparse matrices; Speech; overlap detection; speaker attribution; speaker diarization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288840
  • Filename
    6288840