Regional Information Center for Science and Technology (RICEST) - Factored language model adaptation using Dirichlet class language model for speech recognition

DocumentCode :

639779

Title :

Factored language model adaptation using Dirichlet class language model for speech recognition

Author :

Hatami, Ali ; Akbari, A. ; Nasersharif, Babak

Author_Institution :

Comput. Eng. Dept., IUST, Tehran, Iran

fYear :

2013

fDate :

28-30 May 2013

Firstpage :

438

Lastpage :

442

Abstract :

A language model (LM) is essential for speech recognition systems. Its effectiveness depends on how well it is adapted to linguistic characteristics. Accordingly, adaptation methods attempt to use syntactic and semantic features for language modelling. Previous adaptation methods, such as the family of Dirichlet class language models (DCLM), exploit the class of the history words. Because they lack syntactic information, these methods are not well suited to morphologically rich languages such as Farsi. This paper presents an overview of using syntactic information such as part-of-speech (POS) in the DCLM and combining it with a factored language model (FLM). In the proposed idea, word clustering is based on the POS of previous words and on the history words. Different LMs are experimentally evaluated using the BijanKhan corpus. The experiments indicate that using POS information along with the history words and the class of the history words improves the FLM and reduces the perplexity on this corpus. Moreover, the LMs are evaluated using the Farsdat corpus in a hidden Markov model based automatic speech recognition (ASR) system. Exploiting POS information along with the DCLM achieved a 1.2% relative gain in the word error rate of the ASR system over the DCLM.
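
The abstract reports results in terms of perplexity and relative word error rate (WER) gain without defining them. The following is a minimal sketch of how these two metrics are commonly computed; it is not taken from the paper. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def perplexity(word_probs):
    """Perplexity from the probabilities a language model assigned to
    each word of a test text (lower is better)."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def relative_wer_gain(baseline_wer, new_wer):
    """Relative reduction in word error rate with respect to a baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical per-word probabilities from two models on the same text.
baseline_probs = [0.10, 0.05, 0.20, 0.10]
adapted_probs = [0.12, 0.08, 0.20, 0.15]

print(perplexity(baseline_probs))
print(perplexity(adapted_probs))

# Hypothetical WER values (as fractions), not the paper's figures.
print(relative_wer_gain(0.25, 0.20))
```

A model that assigns higher probability to the observed words has lower perplexity, which is the sense in which the abstract says POS information "reduces the perplexity".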

Keywords :

hidden Markov models; natural language processing; pattern clustering; speech recognition; text analysis; ASR system; BijanKhan corpus; DCLM; Dirichlet class language model; FLM; Farsdat corpus; LM; POS information; adaptation methods; automatic speech recognition system; corpus perplexity; factored language model adaptation; hidden Markov model; history words; linguistic characteristics; part-of-speech; relative gain; semantic features; syntactic features; syntactic information; word clustering; word error rate; Acoustics; Adaptation models; Computational modeling; Hidden Markov models; History; Speech; Speech recognition; factored language model; part-of-speech; perplexity; speech recognition; word error rate;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Information and Knowledge Technology (IKT), 2013 5th Conference on

Conference_Location :

Shiraz

Print_ISBN :

978-1-4673-6489-8

Type :

conf

DOI :

10.1109/IKT.2013.6620107

Filename :

6620107

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=639779