• DocumentCode
    2234677
  • Title

    A real-time lip localization and tacking for lip reading

  • Author

    WenJuan, Yao ; YaLing, Liang ; Minghui, Du

  • Author_Institution
    Sch. of Electron. & Inf. Eng., South China Univ. of Technol., Guangzhou, China
  • Volume
    6
  • fYear
    2010
  • fDate
    20-22 Aug. 2010
  • Abstract
    Most automatic speech recognition systems rely exclusively on the acoustic speech signal and are therefore susceptible to acoustic noise. The benefits of visual speech cues have motivated significant interest in automatic lip-reading, which aims to improve automatic speech recognition by exploiting informative visual features of a speaker's mouth region; among these, the speaker's lip motion stands out as the most linguistically informative visual feature. In this paper, we present an improved, robust lip localization and tracking approach aimed at improving lip-reading accuracy. Lip regions of interest are detected by a new method built on Intel's open-source library OpenCV. In this method, we analyze the spatial relationship between the face, eyes, and mouth, so that the mouth region can be located easily; this proves to be an effective basis for lip tracking. In the subsequent step, the image is converted from the RGB color space to the Lab color space, and the a component of Lab is used to segment the lips and track the lip region more accurately and efficiently in video sequences of a speaker's talking face under different lighting conditions, lip shapes, and head poses. Extensive experiments show that the proposed method outperforms other similar lip tracking approaches and can be effectively integrated into lip-reading or visual speech recognition systems.
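
The two steps named in the abstract (locating the mouth from the face and eye geometry, then segmenting the lips on the a component of Lab) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the geometric proportions, and the mean-value threshold are all assumptions chosen for illustration. In practice the face and eye boxes would typically come from OpenCV's `cv2.CascadeClassifier.detectMultiScale`, and the color conversion from `cv2.cvtColor` with `cv2.COLOR_BGR2LAB` (which, for 8-bit images, stores the a channel offset by 128).

```python
import math


def mouth_roi_from_face(face, eyes=None):
    """Estimate a mouth box (x, y, w, h) from a face box and optional eye boxes.

    All proportions below are illustrative assumptions, not values from the paper.
    """
    fx, fy, fw, fh = face
    if eyes and len(eyes) >= 2:
        (ax, ay, aw, ah), (bx, by, bw, bh) = eyes[:2]
        c1 = (ax + aw / 2, ay + ah / 2)          # center of first eye
        c2 = (bx + bw / 2, by + bh / 2)          # center of second eye
        d = math.hypot(c2[0] - c1[0], c2[1] - c1[1])  # inter-eye distance
        mid_x, mid_y = (c1[0] + c2[0]) / 2, (c1[1] + c2[1]) / 2
        w, h = 1.2 * d, 0.7 * d                  # assumed mouth box size
        x, y = mid_x - w / 2, mid_y + 1.1 * d - h / 2  # assumed vertical offset
    else:
        # Fallback without eyes: a band in the lower part of the face (assumed).
        x, y, w, h = fx + 0.25 * fw, fy + 0.65 * fh, 0.5 * fw, 0.30 * fh
    # Clip the estimate to the face box.
    x0, y0 = max(fx, x), max(fy, y)
    x1, y1 = min(fx + fw, x + w), min(fy + fh, y + h)
    return (round(x0), round(y0), round(x1 - x0), round(y1 - y0))


def _srgb_to_linear(c):
    """Undo the sRGB gamma for one 8-bit channel value."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Nonlinearity used in the CIE Lab definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def lab_a(r, g, b):
    """Return the CIE Lab a value (green-red axis) of an 8-bit sRGB pixel, D65 white."""
    rl, gl, bl = (_srgb_to_linear(v) for v in (r, g, b))
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    return 500 * (_f(x / 0.95047) - _f(y / 1.0))


def lip_mask(pixels, threshold=None):
    """Mark pixels whose a value exceeds a threshold (lips are redder than skin).

    `pixels` is a list of rows of (r, g, b) tuples. If no threshold is given,
    the mean a value of the region is used, a crude assumption for this sketch.
    """
    a = [[lab_a(*p) for p in row] for row in pixels]
    if threshold is None:
        flat = [v for row in a for v in row]
        threshold = sum(flat) / len(flat)
    return [[v > threshold for v in row] for row in a]
```

A caller would first compute `mouth_roi_from_face(face_box, eye_boxes)`, crop that region from each frame, and pass the cropped pixels to `lip_mask`; a real system would replace the mean threshold with something more robust to lighting changes.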
  • Keywords
    image colour analysis; image recognition; image segmentation; image sequences; speech recognition; Intel Open source; Lab color space; OpenCV; RGB color space; automatic speech recognition systems; lip reading; lip segmentation; lip tacking; real-time lip localization; speaker lip motion; video sequences; visual speech recognition systems; Color; Lighting; Lips; Mouth; Visualization; a component; lip tracking
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on
  • Conference_Location
    Chengdu
  • ISSN
    2154-7491
  • Print_ISBN
    978-1-4244-6539-2
  • Type

    conf

  • DOI
    10.1109/ICACTE.2010.5579830
  • Filename
    5579830