Authors:
Nayak, Sunita; Sarkar, Sudeep; Loeding, Barbara
Abstract:
The common practice in sign language recognition is to first construct individual sign models, in terms of discrete state transitions, mostly represented using Hidden Markov Models, from manually isolated sign samples, and then to use them to recognize signs in continuous sentences. In this paper we (i) propose a continuous state space model, where the states are based on purely image-based features, without the use of special gloves, and (ii) present an unsupervised approach to both extract and learn models for continuous basic units of signs, which we term signemes, from continuous sentences. Given a set of sentences containing a common sign, we can automatically learn the model for the part of the sign, or signeme, that is least affected by coarticulation. Coarticulation effects exist in speech recognition, but they are even stronger in sign language. The model itself is expressed in terms of traces in a space of Relational Distributions. Each point in this space is a Relational Distribution, capturing the spatial relationships between low-level features, such as edge points. We perform speed normalization and then incrementally extract the common sign, or signeme, between sentences, with a dynamic programming framework at the core to compute the warped distance between two subsentences. We test our idea on the publicly available Boston SignStream Dataset by building signeme models for 18 signs. We evaluate the quality of the models by considering how well we can localize the sign in a new sentence. We also present preliminary results on the ability to generalize across signers.
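To make the two core ingredients of the abstract concrete, the sketch below illustrates, in simplified form, (a) a Relational Distribution: a normalized histogram over pairwise spatial offsets between edge points, so that each frame becomes one point in the space of such distributions, and (b) a dynamic-programming warped distance between two frame sequences. This is a minimal illustrative sketch, not the authors' implementation; function names, the bin count, and the frame-to-frame cost (Euclidean distance between histograms) are assumptions for illustration only.

```python
import numpy as np

def relational_distribution(edge_points, n_bins=16):
    """Histogram the size-normalized pairwise (dx, dy) offsets between
    edge points of one frame; returns a flattened, L1-normalized
    2-D histogram (one point in the space of Relational Distributions)."""
    pts = np.asarray(edge_points, dtype=float)
    diffs = (pts[:, None, :] - pts[None, :, :]).reshape(-1, 2)  # all pairs
    scale = np.abs(diffs).max() or 1.0        # normalize away overall size
    hist, _, _ = np.histogram2d(diffs[:, 0] / scale, diffs[:, 1] / scale,
                                bins=n_bins, range=[[-1, 1], [-1, 1]])
    return (hist / hist.sum()).ravel()

def warped_distance(seq_a, seq_b):
    """Dynamic-programming (DTW-style) warped distance between two
    sequences of relational-distribution vectors, aligning frames
    so that differences in signing speed are tolerated."""
    la, lb = len(seq_a), len(seq_b)
    D = np.full((la + 1, lb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # frame cost
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[la, lb]
```

Under this reading, a common-sign search between two sentences amounts to evaluating the warped distance over candidate subsentence pairs and keeping the lowest-cost alignment; the paper's actual formulation should be consulted for the precise matching criterion.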