Title :
Semantic language models for Automatic Speech Recognition
Author :
Bayer, Ali Orkan ; Riccardi, Giuseppe
Author_Institution :
Signals & Interactive Syst. Lab., Univ. of Trento, Trento, Italy
Abstract :
We are interested in the problem of semantics-aware training of language models (LMs) for Automatic Speech Recognition (ASR). Traditional language modeling research has ignored semantic constraints and focused on limited-size histories of words. Semantic structures may provide information to capture lexically realized long-range dependencies as well as the linguistic scene of a speech utterance. In this paper, we present a novel semantic LM (SELM) based on the theory of frame semantics. Frame semantics analyzes the meaning of words by considering the roles they play in the semantic frames in which they occur and by considering their syntactic properties. We show that by integrating semantic frames and target words into recurrent neural network LMs we gain significant improvements in perplexity and word error rate. We evaluate SELMs against publicly available ASR baselines on the Wall Street Journal (WSJ) corpus. SELMs achieve 50% and 64% relative reductions in perplexity compared to n-gram models by using frames and target words, respectively. In addition, SELMs achieve 12% and 7% relative improvements in word error rate on the Nov'92 and Nov'93 test sets, respectively, with respect to the baseline tri-gram LM.
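For illustration, the sketch below shows one way semantic frame tags could be fed into a recurrent neural network LM alongside words, in the spirit of the SELM described above: each input token is represented by concatenating a word embedding with a frame embedding. The class name FrameAugmentedLM, the vocabulary and frame inventory sizes, and the plain concatenative design are assumptions made for this sketch, not the authors' implementation.

# Minimal sketch of a frame-augmented recurrent LM (PyTorch).
# Hypothetical names and dimensions; not the paper's actual model.
import torch
import torch.nn as nn

class FrameAugmentedLM(nn.Module):
    def __init__(self, vocab_size, frame_size,
                 word_dim=128, frame_dim=32, hidden_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.frame_emb = nn.Embedding(frame_size, frame_dim)
        # The recurrent layer sees each word jointly with its frame tag.
        self.rnn = nn.RNN(word_dim + frame_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, frames, hidden=None):
        # words, frames: (batch, seq_len) integer id tensors
        x = torch.cat([self.word_emb(words), self.frame_emb(frames)], dim=-1)
        h, hidden = self.rnn(x, hidden)
        return self.out(h), hidden  # logits over the next word

# Usage: next-word logits for a toy batch.
model = FrameAugmentedLM(vocab_size=10000, frame_size=500)
words = torch.randint(0, 10000, (2, 7))
frames = torch.randint(0, 500, (2, 7))
logits, _ = model(words, frames)
print(logits.shape)  # torch.Size([2, 7, 10000])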
Keywords :
computational linguistics; recurrent neural nets; speech recognition; Wall Street Journal corpus; automatic speech recognition; frame semantics; n-gram models; recurrent neural network; semantic constraints; semantic frames; semantic language models; semantic structures; semantics-aware training; speech utterance; syntactic properties; word error rates; Data models; Semantics; Frame Semantics; Language Modeling; Recurrent Neural Networks; Semantic Language Models
Conference_Title :
2014 IEEE Spoken Language Technology Workshop (SLT)
DOI :
10.1109/SLT.2014.7078541