• DocumentCode
    794805
  • Title

    Unsupervised speaker recognition based on competition between self-organizing maps

  • Author

    Lapidot, Itshak ; Guterman, Hugo ; Cohen, Arnon

  • Author_Institution
    Dept. of Software Eng., Negev Acad. Coll. of Eng., Beer-Sheva, Israel
  • Volume
    13
  • Issue
    4
  • fYear
    2002
  • fDate
    7/1/2002 12:00:00 AM
  • Firstpage
    877
  • Lastpage
    887
  • Abstract
    We present a method for clustering the speakers in an unlabeled and unsegmented conversation (with a known number of speakers) when no a priori knowledge about the identity of the participants is given. Each speaker is modeled by a self-organizing map (SOM). The SOMs are randomly initialized. An iterative algorithm allows the data to move from one model to another and adjusts the SOMs. The restriction that the data can move only in small groups, rather than one feature vector at a time, forces the SOMs to adjust to speakers (instead of phonemes or other vocal events). This method was applied to high-quality conversations with two to five participants and to two-speaker telephone-quality conversations. For two speakers (both high- and telephone-quality) and for three speakers, over 80% of the segmentation was correct. The problem becomes even harder when the number of participants is also unknown. Based on the iterative clustering algorithm, a validity criterion was also developed to estimate the number of speakers. In 16 out of 17 high-quality conversations between two and three participants, the estimated number of participants was correct. For telephone-quality conversations, the results were poorer.
  • Keywords
    pattern clustering; self-organising feature maps; speaker recognition; unsupervised learning; SOM; iterative algorithm; self-organizing map competition; speaker clustering; unlabeled unsegmented conversation; unsupervised speaker recognition; Bandwidth; Clustering algorithms; Clustering methods; Computer security; Iterative algorithms; Self organizing feature maps; Speaker recognition; Speech; Training data; Vector quantization
  • fLanguage
    English
  • Journal_Title
    Neural Networks, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9227
  • Type

    jour

  • DOI
    10.1109/TNN.2002.1021888
  • Filename
    1021888
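
The iterative scheme outlined in the abstract (one randomly initialized SOM per speaker, with data allowed to move between models only in small groups) can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: the 1-D map topology, the fixed segment length `seg_len`, the mean quantization distortion used to reassign segments, and every function and parameter name are assumptions made for illustration only.

```python
import numpy as np

def update_som(weights, data, lr=0.2, sigma=1.0):
    """One pass of online updates for a 1-D SOM (assumed topology); edits weights in place."""
    grid = np.arange(len(weights))
    for x in data:
        # Best-matching unit: the codeword closest to the current frame.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood around the BMU on the 1-D grid.
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def segment_distortion(weights, segment):
    """Mean squared distance from each frame of a segment to its nearest codeword."""
    d = ((segment[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def competitive_som_clustering(features, n_speakers, seg_len=50,
                               n_units=4, n_iter=10, seed=0):
    """Cluster fixed-length segments by letting per-speaker SOMs compete for them."""
    rng = np.random.default_rng(seed)
    dim = features.shape[1]
    n_seg = len(features) // seg_len
    # Frames move between models only as whole segments (the "small groups").
    segments = features[: n_seg * seg_len].reshape(n_seg, seg_len, dim)
    # Random initial segment labels and randomly initialized SOMs.
    labels = rng.integers(n_speakers, size=n_seg)
    soms = [features[rng.choice(len(features), n_units, replace=False)].astype(float)
            for _ in range(n_speakers)]
    for _ in range(n_iter):
        # Adjust each SOM on the segments it currently owns.
        for k in range(n_speakers):
            members = segments[labels == k]
            if len(members):
                update_som(soms[k], members.reshape(-1, dim))
        # Reassign every segment to the SOM that represents it best.
        new_labels = np.array([
            np.argmin([segment_distortion(w, s) for w in soms]) for s in segments
        ])
        if np.array_equal(new_labels, labels):
            break  # no segment changed model, so stop iterating
        labels = new_labels
    return labels, soms
```

Calling `competitive_som_clustering(features, n_speakers=2)` on an array of shape `(n_frames, n_dims)` returns one label per segment together with the trained codebooks. Reassigning whole segments rather than individual frames is the point of the sketch: it mirrors the abstract's restriction that pushes each model toward a speaker instead of a phoneme-like sound class.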