Title :
Lip reading system using novel Japanese visemes classification and hierarchical weighted discrimination
Author :
Okita, Shinsuke ; Mitsukura, Yasue ; Hamada, Nozomu
Author_Institution :
Dept. of Syst. Design Eng., Keio Univ., Yokohama, Japan
Abstract :
In recent years, automatic lip reading based on 'visemes' has been studied by researchers to realize human-machine interactive communication systems in many applications. However, many problems remain, such as defining the number of viseme classes, the method for discriminating visemes, and the speech recognition method based on visemes. In this paper, a novel classification of Japanese visemes and a hierarchical weighted discrimination method for speech recognition are proposed to address these problems. We increased the number of viseme classes from 6 (conventional) to 9 to represent words in more detail by visemes. In addition, considering that discrimination becomes more difficult as the number of visemes increases, the hierarchical weighted discrimination method is proposed. To compare with the conventional method, the ATR phonetically balanced word group, which has a large vocabulary and includes various visemes, was used in word recognition experiments. From these results, we confirmed that the proposed method worked well.
Keywords :
image recognition; natural language processing; speech recognition; Japanese visemes classification; automatic lip reading; hierarchical weighted discrimination method; human-machine interactive communication system; speech recognition method; visemes discrimination method; word recognition experiments; Feature extraction; Hidden Markov models; Mouth; Proposals; Speech recognition; Visualization; Image processing; Pattern recognition; lip reading; visemes mouth-shape code; visual speech recognition;
Conference_Title :
Intelligent Signal Processing and Communications Systems (ISPACS), 2013 International Symposium on
Conference_Location :
Naha
Print_ISBN :
978-1-4673-6360-0
DOI :
10.1109/ISPACS.2013.6704608