DocumentCode :
268144
Title :
Classification and Ranking Approaches to Discriminative Language Modeling for ASR
Author :
Dikici, Erinç ; Semerci, Murat ; Saraçlar, Murat ; Alpaydın, Ethem
Author_Institution :
Dept. of Electr. & Electron. Eng., Bogazici Univ., Istanbul, Turkey
Volume :
21
Issue :
2
fYear :
2013
fDate :
Feb. 2013
Firstpage :
291
Lastpage :
300
Abstract :
Discriminative language modeling (DLM) is a feature-based approach that is used as an error-correcting step after hypothesis generation in automatic speech recognition (ASR). We formulate this as both a classification and a ranking problem and employ the perceptron, the margin infused relaxed algorithm (MIRA) and the support vector machine (SVM). To decrease training complexity, we try count-based thresholding for feature selection and data sampling from the list of hypotheses. On a Turkish morphology-based feature set, we examine the use of first- and higher-order n-grams and present an extensive analysis of the complexity and accuracy of the models, with an emphasis on statistical significance. We find that feature selection and data sampling save significantly on computation without significant loss in accuracy. Using MIRA or the SVM does not lead to any further improvement over the perceptron, but the use of ranking as opposed to classification leads to a statistically significant 0.4% reduction in word error rate (WER).
Keywords :
error correction; higher order statistics; perceptrons; speech recognition; support vector machines; ASR; DLM; MIRA; SVM; Turkish morphology; WER; automatic speech recognition; classification problem; count-based thresholding; data sampling; discriminative language modeling; error-correcting step; feature selection; higher order n-grams; hypothesis generation; margin infused relaxed algorithm; perceptron; ranking problem; support vector machine; training complexity; word error rate; Accuracy; Complexity theory; Error analysis; Prototypes; Support vector machines; Training; Vectors; Discriminative language modeling (DLM); data sampling; feature selection; language modeling; margin infused relaxed algorithm (MIRA); ranking MIRA; ranking perceptron; ranking support vector machine (SVM); speech recognition;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2012.2221461
Filename :
6317141