DocumentCode
3232005
Title
Effect of articulatory Δ and ΔΔ parameters on multilayer neural network based speech recognition
Author
Banik, Manoj ; Kotwal, Mohammed Rokibul Alam ; Hassan, Foyzul ; Islam, Gazi Md Moshfiqul ; Rahman, Sharif Mohammad Musfiqur ; Hasan, Mohammad Mahedi ; Muhammad, Ghulam ; Mohammad, Nurul Huda
Author_Institution
Dept. of CSE, Ahsanullah Univ. of Sci. & Technol., Dhaka, Bangladesh
fYear
2010
fDate
6-9 Dec. 2010
Firstpage
624
Lastpage
627
Abstract
This paper describes the effect of articulatory dynamic parameters (Δ and ΔΔ) on neural network-based automatic speech recognition (ASR). Systems based on articulatory features (AFs), or distinctive phonetic features (DPFs), outperform systems based on acoustic features in ASR. Their performance can be further improved by incorporating articulatory dynamic parameters. In this paper, we propose a phoneme recognition system that comprises three stages: (i) DPF extraction from acoustic features using a multilayer neural network (MLN), (ii) incorporation of dynamic parameters into a second MLN to reduce DPF context, and (iii) addition of an Inhibition/Enhancement (In/En) network to categorize DPF movement more accurately, followed by a Gram-Schmidt (GS) orthogonalization procedure that decorrelates the inhibited/enhanced data vector before it is passed to a hidden Markov model (HMM)-based classifier. In experiments on the Japanese Newspaper Article Sentences (JNAS) corpus, the proposed method provides a higher phoneme correct rate than the method that does not incorporate dynamic articulatory parameters. Moreover, it reduces the number of HMM mixture components needed to obtain higher recognition performance.
Keywords
hidden Markov models; neural nets; speech recognition; ΔΔ parameter; DPF context; Gram-Schmidt orthogonalization procedure; acoustic feature; articulatory Δ parameter; articulatory dynamic parameter; automatic speech recognition; distinctive phonetic features; hidden Markov model based classifier; multilayer neural network; phoneme recognition system; phonetic feature; Acoustics; Artificial neural networks; Context; Feature extraction; Hidden Markov models; Speech; Speech recognition; Distinctive Phonetic Features; Dynamic Parameters; Hidden Markov Models; Local Features; Multi-Layer Neural Network
fLanguage
English
Publisher
IEEE
Conference_Titel
Circuits and Systems (APCCAS), 2010 IEEE Asia Pacific Conference on
Conference_Location
Kuala Lumpur
Print_ISBN
978-1-4244-7454-7
Type
conf
DOI
10.1109/APCCAS.2010.5775027
Filename
5775027