ROI processing for visual features extraction in lip-reading

Author

Wang, Xiaoping ; Hao, Yufeng ; Fu, Degang ; Yuan, Chunwei

Author_Institution

State Key Lab. of Bioelectron., Southeast Univ., Nanjing

fYear

2008

fDate

7-11 June 2008

Firstpage

178

Lastpage

181

Abstract

The region of interest (ROI) is the basis of visual feature extraction in lip-reading. This paper discusses ROI processing and examines its impact on recognition accuracy by comparing four kinds of processed ROIs, each obtained with one of four basic image processing methods: gray-scale normalization, difference enhancement, edge enhancement, and image segmentation. Speaker-independent lip-reading recognition tasks were then carried out using a continuous hidden Markov model (CHMM). The experimental results show that, for features based on the discrete cosine transform (DCT), the normalized gray-scale image achieves the best recognition performance among the four ROIs.
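
The abstract does not give implementation details, so the following is only a minimal sketch of what extracting DCT-based features from a gray-scale-normalized ROI might look like. Several choices here are assumptions and not taken from the paper: min-max scaling as the normalization, an orthonormal 2-D DCT-II, keeping the top-left low-frequency block as the feature vector, and the hypothetical helper names `normalize_gray`, `dct_matrix`, and `dct_features`.

```python
import numpy as np

def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return c

def normalize_gray(roi):
    """Min-max scale a gray-scale ROI to [0, 1] (assumed normalization)."""
    roi = np.asarray(roi, dtype=np.float64)
    lo, hi = roi.min(), roi.max()
    if hi == lo:                      # flat image: avoid division by zero
        return np.zeros_like(roi)
    return (roi - lo) / (hi - lo)

def dct_features(roi, block=4):
    """Normalize the ROI, apply a 2-D DCT, keep the low-frequency block."""
    x = normalize_gray(roi)
    h, w = x.shape
    coeffs = dct_matrix(h) @ x @ dct_matrix(w).T   # separable 2-D DCT-II
    # Assumption: the top-left block x block coefficients form the feature.
    return coeffs[:block, :block].ravel()

# Synthetic 16 x 32 "mouth region" standing in for a real ROI.
rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(16, 32))
features = dct_features(roi, block=4)
```

In a recognition pipeline of the kind the abstract describes, one such feature vector per video frame would form the observation sequence passed to the CHMM; the block size and coefficient selection scheme would need to be tuned to the data.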

Keywords

discrete cosine transforms; edge detection; feature extraction; hidden Markov models; image segmentation; continuous hidden Markov model; difference enhancement; discrete cosine transform; edge enhancement; gray-scale normalization; image processing; image segmentation; region of interest; speaker-independent lip-reading; visual features extraction; Active shape model; Automatic speech recognition; Chromium; Discrete cosine transforms; Face detection; Feature extraction; Hidden Markov models; Image processing; Image recognition; Skin; Lip-reading; ROI; Visual Features Extraction;

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Networks and Signal Processing, 2008 International Conference on

Conference_Location

Nanjing

Print_ISBN

978-1-4244-2310-1

Electronic_ISBN

978-1-4244-2311-8

Type

conf

DOI

10.1109/ICNNSP.2008.4590335

Filename

4590335