  • DocumentCode
    3422767
  • Title
    Language recognition with discriminative keyword selection
  • Author
    Richardson, F.S.; Campbell, W.M.
  • Author_Institution
    Lincoln Lab., MIT, Cambridge, MA
  • fYear
    2008
  • fDate
    March 31 - April 4, 2008
  • Firstpage
    4145
  • Lastpage
    4148
  • Abstract
    One commonly used approach to language recognition is to convert the input speech into a sequence of tokens, such as words or phones, and then to use these token sequences to determine the target language. The language classification is typically performed by extracting N-gram statistics from the token sequences and then using an N-gram language model or support vector machine (SVM) to perform the classification. One problem with these approaches is that the number of N-grams grows exponentially as the order N is increased. This is especially problematic for an SVM classifier, as each utterance is represented as a distinct N-gram vector. In this paper, we propose a novel approach for modeling higher-order N-grams using an SVM via an alternating filter-wrapper feature selection method. We demonstrate the effectiveness of this technique on the NIST 2007 language recognition task.
  • Keywords
    natural language processing; pattern classification; speaker recognition; support vector machines; N-gram language model; N-gram statistics; N-gram vector; alternating filter-wrapper feature selection method; discriminative keyword selection; language classification; language recognition; support vector machine; token sequences; Hidden Markov models; Laboratories; Lattices; Natural languages; Power system modeling; Speech recognition; Statistics; Support vector machine classification; Support vector machines; Target recognition; Language Recognition; Support Vector Machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518567
  • Filename
    4518567
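
  • Illustration
    The abstract notes that the number of possible N-grams grows exponentially with the order N, which inflates the per-utterance N-gram vector an SVM would consume. The sketch below is only an illustration of that point, not the authors' method: it builds a sparse N-gram relative-frequency vector from a token sequence and computes the upper bound V^N on distinct N-grams. The function names and the 50-phone inventory size are assumptions made for the example; the alternating filter-wrapper selection step is not described in this record and is not reproduced.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_counts(tokens: Sequence[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_vector(tokens: Sequence[str], n: int) -> Dict[NGram, float]:
    """Sparse vector of n-gram relative frequencies for one utterance."""
    counts = ngram_counts(tokens, n)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def max_ngram_types(vocab_size: int, n: int) -> int:
    """Upper bound on distinct n-grams over a vocabulary: vocab_size ** n."""
    return vocab_size ** n


if __name__ == "__main__":
    # Hypothetical phone-token sequence for a single utterance.
    utterance = ["k", "ae", "t", "s", "ae", "t"]
    print(ngram_vector(utterance, 2))

    # Assumed inventory of 50 phones; the bound grows as 50 ** n.
    for order in range(1, 5):
        print(order, max_ngram_types(50, order))
```

    Storing only the N-grams that actually occur keeps each utterance vector sparse, but the number of candidate dimensions still follows the V^N bound, which is the growth the abstract refers to.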