DocumentCode
3422767
Title
Language recognition with discriminative keyword selection
Author
Richardson, F.S. ; Campbell, W.M.
Author_Institution
Lincoln Lab., MIT, Cambridge, MA
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
4145
Lastpage
4148
Abstract
One commonly used approach for language recognition is to convert the input speech into a sequence of tokens such as words or phones and then to use these token sequences to determine the target language. The language classification is typically performed by extracting N-gram statistics from the token sequences and then using an N-gram language model or support vector machine (SVM) to perform the classification. One problem with these approaches is that the number of N-grams grows exponentially as the order N is increased. This is especially problematic for an SVM classifier, as each utterance is represented as a distinct N-gram vector. In this paper, we propose a novel approach for modeling higher-order N-grams using an SVM via an alternating filter-wrapper feature selection method. We demonstrate the effectiveness of this technique on the NIST 2007 language recognition task.
Keywords
natural language processing; pattern classification; speaker recognition; support vector machines; N-gram language model; N-gram statistics; N-gram vector; alternating filter-wrapper feature selection method; discriminative keyword selection; language classification; language recognition; support vector machine; token sequences; Hidden Markov models; Laboratories; Lattices; Natural languages; Power system modeling; Speech recognition; Statistics; Support vector machine classification; Target recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518567
Filename
4518567