Title :
Visual voice activity detection via chaos based lip motion measure robust under illumination changes
Author :
Taeyup Song ; Kyungsun Lee ; Hanseok Ko
Author_Institution :
Dept. of Biomicrosyst. Eng., Korea Univ., Seoul, South Korea
Abstract :
This paper proposes a vision-based voice activity detection (VVAD) algorithm built on chaos theory. Conventional VVAD algorithms measure the movement of the lip region by applying an optical flow algorithm and detect visual speech frames using a motion-based energy feature set. However, because motion-based features are unstable under illumination changes, a more robust feature set is desirable. It is proposed that contextual changes, such as lip opening or closing motion during speech utterances under illumination variation, can be observed as chaos-like, and that the resulting complex fractal trajectories can be observed in phase space. The fractality is measured in phase space from two sequential video input frames, and visual speech frames are then robustly detected. Representative experiments are performed on image sequences of a driver scene undergoing illumination fluctuations in a moving vehicle environment. Experimental results indicate a substantial improvement over the conventional method in the form of a significantly lower false alarm rate.
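As a rough illustration of why a fractal measure in phase space might tolerate illumination changes better than a motion energy feature, the sketch below pairs each pixel's intensity in two sequential frames as a 2-D point and estimates a box-counting dimension of the resulting point set. This is not the authors' formulation, which the abstract does not specify. The phase-space construction, the box sizes, the frame-difference energy used in place of an optical flow feature, and all function names are assumptions made for illustration only.

```python
import numpy as np


def motion_energy(prev_frame, next_frame):
    """Mean squared intensity difference (assumed stand-in for a motion energy feature)."""
    diff = np.asarray(next_frame, dtype=np.float64) - np.asarray(prev_frame, dtype=np.float64)
    return float(np.mean(diff ** 2))


def phase_space_points(prev_frame, next_frame):
    """Pair each pixel's intensity in two sequential frames as one 2-D point (assumed embedding)."""
    a = np.asarray(prev_frame, dtype=np.float64).ravel()
    b = np.asarray(next_frame, dtype=np.float64).ravel()
    return np.stack([a, b], axis=1)


def box_counting_dimension(points, box_sizes=(4, 8, 16, 32, 64)):
    """Estimate fractal dimension as the slope of log N(s) against log(1/s)."""
    pts = points - points.min(axis=0)  # shift so the grid starts at the origin
    sizes = np.asarray(box_sizes, dtype=np.float64)
    counts = []
    for s in sizes:
        cells = np.floor(pts / s).astype(np.int64)      # grid cell index of each point
        counts.append(np.unique(cells, axis=0).shape[0])  # number of occupied cells
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return float(slope)


# Synthetic frames (hypothetical data, not from the paper's experiments).
rng = np.random.default_rng(0)
lips = rng.integers(0, 201, size=(64, 64))    # reference lip-region patch
brighter = lips + 30                          # same patch under a uniform brightness shift
moved = rng.integers(0, 201, size=(64, 64))   # unrelated content, standing in for lip motion

static_dim = box_counting_dimension(phase_space_points(lips, brighter))
moving_dim = box_counting_dimension(phase_space_points(lips, moved))
static_energy = motion_energy(lips, brighter)
```

In this toy setting a uniform brightness shift gives a nonzero frame-difference energy even though nothing moved, while the phase-space points stay on a straight line and the estimated dimension stays close to 1. When the second frame has different content, the points spread over the plane and the estimate rises toward 2. Real illumination changes are rarely uniform, so this only hints at the idea described in the abstract.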
Keywords :
chaos; driver information systems; feature extraction; fractals; human computer interaction; image motion analysis; image sequences; lighting; object detection; speech recognition; video signal processing; VVAD algorithm; automatic speech recognition; chaos based lip motion measure; chaos theory; contextual changes; driver scene; human machine interaction; human vehicle interaction; illumination changes; illumination fluctuations; illumination variation; image sequence; lip closing motion; lip opening motion; lip region movement measure; mobile platforms; motion based energy feature set; moving vehicle environment; optical flow algorithm; phase space; resultant complex fractal trajectories; robust feature set; sequential video input frames; speech utterances; vehicular platforms; visual speech frame detect; visual voice activity detection algorithm; Chaos; Fractals; Lighting; Motion measurement; Speech; Trajectory; Visualization; Chaos inspired motion feature; voice activity detection;
Journal_Title :
Consumer Electronics, IEEE Transactions on
DOI :
10.1109/TCE.2014.6852001