DocumentCode :
639779
Title :
Factored language model adaptation using Dirichlet class language model for speech recognition
Author :
Hatami, Ali ; Akbari, A. ; Nasersharif, Babak
Author_Institution :
Comput. Eng. Dept., IUST, Tehran, Iran
fYear :
2013
fDate :
28-30 May 2013
Firstpage :
438
Lastpage :
442
Abstract :
A language model (LM) is essential for speech recognition systems. The effectiveness of this model depends on its adaptation to linguistic characteristics. Accordingly, adaptation methods attempt to use syntactic and semantic features for language modelling. Previous adaptation methods, such as the family of Dirichlet class language models (DCLM), exploit the class of history words. Because they lack syntactic information, these methods are not well suited to morphologically rich languages such as Farsi. This paper presents an overview of using syntactic information, such as part-of-speech (POS), in DCLM and combining it with a factored language model (FLM). In the proposed approach, word clustering is based on the POS of previous words and on the history words. Different LMs are evaluated experimentally on the BijanKhan corpus. The experiments indicate that using POS information along with history words and the class of history words improves the FLM and reduces perplexity on this corpus. The LMs are also evaluated on the Farsdat corpus in a hidden Markov model based automatic speech recognition (ASR) system. Exploiting POS information along with DCLM yields a relative word error rate improvement of 1.2% for the ASR system over the DCLM.
Keywords :
hidden Markov models; natural language processing; pattern clustering; speech recognition; text analysis; ASR system; BijanKhan corpus; DCLM; Dirichlet class language model; FLM; Farsdat corpus; LM; POS information; adaptation methods; automatic speech recognition system; corpus perplexity; factored language model adaptation; hidden Markov model; history words; linguistic characteristics; part-of-speech; relative gain; semantic features; syntactic features; syntactic information; word clustering; word error rate; Acoustics; Adaptation models; Computational modeling; Hidden Markov models; History; Speech; Speech recognition; factored language model; part-of-speech; perplexity; speech recognition; word error rate;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information and Knowledge Technology (IKT), 2013 5th Conference on
Conference_Location :
Shiraz
Print_ISBN :
978-1-4673-6489-8
Type :
conf
DOI :
10.1109/IKT.2013.6620107
Filename :
6620107