Model-based versus knowledge-guided representation of non-rigid objects: a case study

Author

Kober, R. ; Schiffers, J. ; Schmidt, K.

Author_Institution

Res. Inst. for Appl. Knowledge Process., Ulm, Germany

Volume

1

fYear

1994

fDate

13-16 Nov 1994

Firstpage

973

Abstract

Two different approaches to the detection and representation of the mouth in real video images of the human face are investigated. The model-based approach presented builds on a technique known as “deformable templates” and tries to approximate the contours of the lips with a model consisting of four parabolas. An alternative to the model-based approach, referred to as the knowledge-guided approach, is proposed. The basic idea is not to capture all of the a priori knowledge about an object in a single global model that is adapted to the image, but rather to apply the a priori knowledge step by step, refining rough initial hypotheses into a compact description of the object. This method may be interpreted as a gradual concentration on the relevant structures in the image. The combination of the resulting structures yields a compact description of the object. In the application that is the basis for this investigation, the goal is to enhance speech recognition by using visual information about lip movements in addition to the acoustic signal. Only the problem of finding an accurate and robust representation of the lips in an image is addressed. Each of the methods was investigated on the same set of 15 faces. Our experiments indicate that the knowledge-guided approach is more accurate and more robust than the model-based approach.

Keywords

face recognition; image representation; knowledge based systems; knowledge representation; model-based reasoning; object detection; speech recognition; video signal processing; acoustic signal; deformable templates; experiments; human face; image structures; knowledge-guided representation; lip movements; lips contour approximation; model-based representation; mouth detection; mouth representation; non-rigid objects representation; parabolas; real video images; visual information; Computer aided software engineering; Eyes; Face detection; Humans; Lips; Mouth; Object detection; Robustness; Solid modeling; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Image Processing, 1994. Proceedings. ICIP-94., IEEE International Conference

Conference_Location

Austin, TX

Print_ISBN

0-8186-6952-7

Type

conf

DOI

10.1109/ICIP.1994.413254

Filename

413254