Title :
Combined articulatory and auditory processing for improved speech recognition
Author :
Huang, Guangpu ; Er, Meng Joo
Author_Institution :
Comput. Vision Lab., Nanyang Technol. Univ., Singapore, Singapore
Abstract :
In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech, and we reported the performance of the combined auditory and articulatory processing methods in inversion and recognition experiments. First, at the front end, the auditory-based Bark-frequency cepstral coefficients (BFCCs) achieved accuracy equal to or higher than that of the mel-frequency cepstral coefficients (MFCCs). Second, the APFs significantly altered the phoneme error patterns relative to purely acoustic features, and they showed advantages over the canonical pseudo-articulatory features (PAFs), which are manually derived from phonological rules. These observations support our view that the combined use of auditory and articulatory cues is beneficial for speech pattern classification. The proposed neural-based API model is a competitive candidate for phoneme recognition, with salient features such as generality and portability.
Keywords :
pattern classification; speech recognition; APF; API; BFCC; MFCC; articulatory phonetic features; articulatory phonetic inversion; articulatory processing; auditory processing; auditory qualities; bark frequency cepstral coefficient; mel-frequency cepstral coefficient; speech pattern classification; Accuracy; Hidden Markov models; Humans; Speech
Conference_Title :
Industrial Electronics and Applications (ICIEA), 2012 7th IEEE Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4577-2118-2
DOI :
10.1109/ICIEA.2012.6360864