DocumentCode
1796928
Title
Leveraging phonetic context dependent invariant structure for continuous speech recognition
Author
Congying Zhang ; Suzuki, M. ; Kurata, Gakuto ; Nishimura, M. ; Minematsu, Nobuaki
Author_Institution
Univ. of Tokyo, Tokyo, Japan
fYear
2014
fDate
9-13 July 2014
Firstpage
52
Lastpage
56
Abstract
Speech acoustics vary intrinsically due to linguistic and non-linguistic factors. The invariant structure extracted from a given utterance is one of the long-span acoustic representations, in which acoustic variation caused by non-linguistic factors can be reasonably removed. It expresses spectral contrasts between acoustic events in an utterance. In previous studies, the invariant structure was leveraged in continuous speech recognition to rerank the N-best candidates hypothesized by a traditional automatic speech recognition (ASR) system. Using invariant structure features for reranking proved effective; however, the features were defined, or labeled, in a phonetic-context-independent way. In this paper, the use of phonetic context to define invariant structure features is examined. The proposed method is tested on two tasks: continuous digits speech recognition and large vocabulary continuous speech recognition (LVCSR). Performance improves by 4.7% and 1.2% relative, respectively.
Keywords
speech recognition; continuous digits speech recognition; large vocabulary continuous speech recognition; long-span acoustic representations; phonetic context dependent invariant structure; spectral contrasts; speech acoustics; utterance; Abstracts; Hidden Markov models; Indexes; Robustness; Speech; Testing; Continuous digits speech recognition; Invariant structure; LVCSR; N-best candidates reranking; Phonetic context;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal and Information Processing (ChinaSIP), 2014 IEEE China Summit & International Conference on
Conference_Location
Xi'an
Print_ISBN
978-1-4799-5401-8
Type
conf
DOI
10.1109/ChinaSIP.2014.6889200
Filename
6889200