DocumentCode :
1419922
Title :
Topic-Dependent-Class-Based n-Gram Language Model
Author :
Naptali, Welly ; Tsuchiya, Masatoshi ; Nakagawa, Seiichi
Author_Institution :
Academic Center for Computing & Media Studies, Kyoto University, Kyoto, Japan
Volume :
20
Issue :
5
fYear :
2012
fDate :
7/1/2012
Firstpage :
1513
Lastpage :
1525
Abstract :
A topic-dependent-class (TDC)-based n-gram language model (LM) is a topic-based LM that employs a semantic extraction method to reveal latent topic information extracted from noun-noun relations. The topic of a given word sequence is decided by voting, on the basis of the most frequently occurring (weighted) noun classes in the context history. Our previous work (W. Naptali, M. Tsuchiya, and S. Nakagawa, "Topic-dependent language model with voting on noun history," ACM Trans. Asian Language Information Processing (TALIP), vol. 9, no. 2, pp. 1-31, 2010) has shown that, in terms of perplexity, TDCs outperform several state-of-the-art baselines, i.e., a word-based or class-based n-gram LM and their interpolation, a cache-based LM, an n-gram-based topic-dependent LM, and a Latent Dirichlet Allocation (LDA)-based topic-dependent LM. This study is a follow-up of our previous work, with three key differences. First, we improve TDCs by employing soft-clustering and/or soft-voting techniques in the training and/or test phases, which resolve data shrinking problems and make TDCs independent of the word-based n-gram. Second, for further improvement, we incorporate a cache-based LM through unigram scaling, because the TDC and cache-based LMs capture different properties of the language. Finally, we provide an evaluation in terms of the word error rate (WER) and an analysis of the automatic speech recognition (ASR) rescoring task. Experiments performed on the Wall Street Journal and the Mainichi Shimbun (a Japanese newspaper) demonstrate that the TDC LM improves both perplexity and the WER. The perplexity reduction is up to 25.1% relative on the English corpus and 25.7% relative on the Japanese corpus. Furthermore, the greatest reduction in the WER, compared to the baseline, is 15.2% relative for the English ASR and 24.3% relative for the Japanese ASR.
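The abstract names two mechanisms: choosing a topic by voting over (weighted) noun classes in the context history, and combining the background LM with a cache-based LM through unigram scaling. The following is a minimal Python sketch of those two ideas under stated assumptions; the function names, class labels, weights, and probabilities are illustrative stand-ins, not the authors' actual model, classes, or data.

# Hypothetical sketch: (1) topic selection by weighted voting over noun
# classes in the context history; (2) unigram scaling of a background
# n-gram probability by a cache unigram. Illustrative only.
from collections import Counter

def vote_topic(history_nouns, noun_to_class, class_weights=None):
    """Return the noun class (topic) that receives the most (weighted)
    votes from the nouns observed in the context history."""
    votes = Counter()
    for noun in history_nouns:
        cls = noun_to_class.get(noun)
        if cls is None:
            continue
        weight = 1.0 if class_weights is None else class_weights.get(cls, 1.0)
        votes[cls] += weight
    return votes.most_common(1)[0][0] if votes else None

def unigram_scaling(p_ngram, p_cache_uni, p_bg_uni, beta=0.5):
    """Scale the background n-gram probability by the ratio of the cache
    (adapted) unigram to the background unigram, raised to beta.
    In practice the result must still be renormalized over the vocabulary."""
    return p_ngram * (p_cache_uni / p_bg_uni) ** beta

# Toy usage with made-up mappings and probabilities.
noun_to_class = {"market": "finance", "stock": "finance", "game": "sports"}
print(vote_topic(["market", "stock", "game"], noun_to_class))  # -> "finance"
print(round(unigram_scaling(0.01, 0.002, 0.001), 4))           # -> 0.0141 (unnormalized)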
Keywords :
natural language processing; speech recognition; English corpus; Japanese corpus; automatic speech recognition rescoring task; context history; data shrinking problem; latent Dirichlet allocation; latent topic information; noun-noun relations; perplexity reduction; semantic extraction method; soft clustering; soft voting; topic-dependent-class-based n-gram language model; unigram scaling; word error rate; word sequence; word-based n-gram; Context; History; Matrix decomposition; Semantics; Speech; Training; Vectors; n-gram; language model; perplexity; speech recognition; topic dependent;
fLanguage :
English
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2012.2183870
Filename :
6129394