Title :
Long short-term memory language models with additive morphological features for automatic speech recognition
Author :
Renshaw, Daniel ; Hall, Keith B.
Author_Institution :
Univ. of Edinburgh, Edinburgh, UK
Abstract :
Models of morphologically rich languages suffer from data sparsity when words are treated as atomic units. Word-based language models cannot transfer knowledge from common word forms to rarer variant forms. Learning a continuous vector representation of each morpheme allows a compositional model to represent a word as the sum of its constituent morphemes' vectors. Rare and unknown words containing common morphemes can thus be represented with greater fidelity despite their sparsity. Our novel neural network language model integrates this additive morphological representation into a long short-term memory architecture, improving Russian speech recognition word error rates by 0.9% absolute (4.4% relative) compared to a robust n-gram baseline model.
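Illustrative sketch (not from the paper): the abstract's additive composition can be pictured as summing morpheme embeddings to form each word vector, then feeding the resulting word vectors to an LSTM language model that predicts the next word. The sketch below is a minimal PyTorch rendering of that idea under assumed names, dimensions, and a padded morpheme-ID layout; it is not the authors' implementation.

```python
import torch
import torch.nn as nn

class AdditiveMorphLSTMLM(nn.Module):
    """Toy LSTM language model over words built additively from morpheme vectors."""

    def __init__(self, num_morphemes, num_words, embed_dim=128, hidden_dim=256):
        super().__init__()
        # One embedding per morpheme; index 0 is reserved as padding so that
        # words with fewer morphemes contribute nothing from the padded slots.
        self.morph_embed = nn.Embedding(num_morphemes, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_words)

    def forward(self, morph_ids):
        # morph_ids: (batch, seq_len, max_morphemes_per_word), 0 = padding.
        # Additive composition: a word vector is the sum of its morpheme vectors.
        word_vecs = self.morph_embed(morph_ids).sum(dim=2)
        hidden, _ = self.lstm(word_vecs)
        return self.out(hidden)  # next-word logits at each position

# Toy usage: 2 sequences of 3 words, each word segmented into up to 2 morphemes.
model = AdditiveMorphLSTMLM(num_morphemes=1000, num_words=5000)
morph_ids = torch.randint(1, 1000, (2, 3, 2))
logits = model(morph_ids)
print(logits.shape)  # torch.Size([2, 3, 5000])
```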
Keywords :
natural language processing; speech recognition; Russian speech recognition word error rates; additive morphological features; additive morphological representation; atomic units; automatic speech recognition; continuous vector representation; data sparsity; long short term memory language models; novel neural network language model; robust n-gram baseline model; transfer knowledge; word based language models; Artificial neural networks; Computational modeling; Mathematical model; Training; Training data; Vocabulary; compositional morphology; language modeling; long short-term memory; neural networks;
Conference_Title :
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
South Brisbane, QLD, Australia
DOI :
10.1109/ICASSP.2015.7178972