DocumentCode :
699991
Title :
Incorporating acoustic feature diversity into the linguistic search space for syllable based speech recognition
Author :
Ramya, R. ; Hegde, Rajesh M. ; Murthy, Hema A.
Author_Institution :
Indian Inst. of Technol. Madras, Chennai, India
fYear :
2008
fDate :
25-29 Aug. 2008
Firstpage :
1
Lastpage :
5
Abstract :
Acoustic features derived from the short-time magnitude and phase spectra provide complementary information. In this paper, we discuss the significance of incorporating this diverse information into the linguistic search space for syllable-based speech recognition. The diversity of group delay acoustic features, computed from the phase spectrum, and MFCCs, computed from the magnitude spectrum, is first illustrated in a lower-dimensional feature space. Motivated by this diversity of information in the acoustic feature space, we derive syllable-feature pairs. The selection of syllable-feature pairs is based on isolated syllable recognition results, computed a priori using the two acoustic feature streams. During the recognition process, likelihoods are weighted according to the syllable-feature pair information using a weighted likelihood scheme. The syllable lattice is then rescored using these weighted syllable-feature pairs in the linguistic search space. This technique of weighting the relevant acoustic feature for each syllable during decoding in the linguistic search space yields a reduced word error rate (WER) in experiments conducted on the TIMIT and DBIL databases.
Keywords :
acoustic signal processing; decoding; error statistics; linguistics; maximum likelihood estimation; speech recognition; DBIL database; MFCC; TIMIT database; WER reduction; decoding process; group delay acoustic features diversity; isolated syllable recognition; linguistic search space; lower dimensional acoustic feature space; syllable based speech recognition; syllable lattice; syllable-feature pair; weighted likelihood scheme; word error rate reduction; Databases; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Conference, 2008 16th European
Conference_Location :
Lausanne
ISSN :
2219-5491
Type :
conf
Filename :
7080523