• DocumentCode
    2854223
  • Title
    Language identification using Gaussian mixture model tokenization
  • Author
    Torres-Carrasquillo, Pedro A.; Reynolds, Douglas A.; Deller, J.R., Jr.
  • Author_Institution
    Department of Electrical Engineering, Michigan State University, East Lansing, USA
  • Volume
    1
  • fYear
    2002
  • fDate
    13-17 May 2002
  • Abstract
    Phone tokenization followed by n-gram language modeling has consistently provided good results for the task of language identification. In this paper, this technique is generalized by using Gaussian mixture models as the basis for tokenizing. Performance results are presented for a system employing a GMM tokenizer in conjunction with multiple language processing and score combination techniques. On the 1996 CallFriend LID evaluation set, a 12-way closed-set error rate of 17% was obtained.
  • Keywords
    Acoustics; Argon; Computational modeling; Encoding; Feature extraction; Speech; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
  • Conference_Location
    Orlando, FL, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2002.5743828
  • Filename
    5743828
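
The abstract names two stages: tokenizing speech frames with a Gaussian mixture model, then scoring the token stream with n-gram language models. The sketch below is one possible reading of that pipeline, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, hard assignment of each frame to its highest-scoring mixture component, the bigram order, add-alpha smoothing, and the toy one-dimensional data. The multiple language processing and score combination stages mentioned in the abstract are not modeled.

```python
import math
from collections import Counter


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def tokenize(frames, gmm):
    """Map each frame to the index of its highest-scoring mixture component.

    gmm is a list of (weight, mean, variance) tuples (hypothetical layout).
    """
    tokens = []
    for x in frames:
        scores = [math.log(w) + log_gauss_diag(x, mean, var) for w, mean, var in gmm]
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens


def train_bigram(tokens, n_symbols, alpha=1.0):
    """Add-alpha smoothed bigram log-probabilities over token indices."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    return {
        (a, b): math.log(
            (pair_counts[(a, b)] + alpha) / (context_counts[a] + alpha * n_symbols)
        )
        for a in range(n_symbols)
        for b in range(n_symbols)
    }


def score(tokens, bigram):
    """Log-likelihood of a token sequence under a bigram model."""
    return sum(bigram[pair] for pair in zip(tokens, tokens[1:]))


# Toy two-component GMM over one-dimensional "features" (assumed values).
toy_gmm = [(0.5, [0.0], [1.0]), (0.5, [5.0], [1.0])]

# Hypothetical per-language token streams used to train the language models.
model_a = train_bigram([0, 1, 0, 1, 0, 1, 0, 1], n_symbols=2)  # alternating
model_b = train_bigram([0, 0, 0, 0, 1, 1, 1, 1], n_symbols=2)  # repetitive

test_tokens = tokenize([[0.1], [4.9], [0.2], [5.1]], toy_gmm)
scores = {"A": score(test_tokens, model_a), "B": score(test_tokens, model_b)}
print(test_tokens, max(scores, key=scores.get))
```

In this toy setup the identified language is the one whose bigram model assigns the token stream the highest log-likelihood. A real system would use multidimensional acoustic features and a GMM trained on speech data.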