Development of speechreading supplements based on automatic speech recognition

Author

Duchnowski, Paul ; Lum, David S. ; Krause, Jean C. ; Sexton, Matthew G. ; Bratakos, Maroula S. ; Braida, Louis D.

Author_Institution

Res. Lab. of Electron., MIT, Cambridge, MA, USA

Volume

47

Issue

4

fYear

2000

fDate

4/1/2000 12:00:00 AM

Firstpage

487

Lastpage

496

Abstract

In manual-cued speech (MCS), a speaker produces hand gestures to resolve ambiguities among speech elements that are often confused by speechreaders. The shape of the hand distinguishes among consonants; the position of the hand relative to the face distinguishes among vowels. Experienced receivers of MCS achieve nearly perfect reception of everyday connected speech. MCS has been taught to very young deaf children and greatly facilitates language learning, communication, and general education. This manuscript describes a system that can produce a form of cued speech automatically in real time and reports on its evaluation by trained receivers of MCS. Cues are derived by a hidden Markov model (HMM)-based, speaker-dependent phonetic speech recognizer that uses context-dependent phone models, and are presented visually by superimposing animated handshapes on the face of the talker. The benefit provided by these cues strongly depends on articulation of hand movements and on precise synchronization of the actions of the hands and the face. Using the system reported here, experienced cue receivers can recognize roughly two-thirds of the keywords in cued low-context sentences correctly, compared to roughly one-third by speechreading alone (SA). The practical significance of these improvements is to support fairly normal rates of reception of conversational speech, a task that is often difficult via SA.

Keywords

handicapped aids; hidden Markov models; speech recognition; ambiguities resolution; autocuer; automatic speech recognition; consonants; context-dependent phone models; cued speech; everyday connected speech; general education; hand movements articulation; hand shape; language learning; manual-cued speech; speaker-dependent phonetic speech recognizer; speechreading supplements development; transliteration; very young deaf children; Automatic speech recognition; Context modeling; Deafness; Face recognition; Facial animation; Hidden Markov models; Real time systems; Shape; Speech analysis; Speech recognition

fLanguage

English

Journal_Title

Biomedical Engineering, IEEE Transactions on

Publisher

IEEE

ISSN

0018-9294

Type

jour

DOI

10.1109/10.828148

Filename

828148