Regional Information Center for Science and Technology - Inferring the Structure of a Tennis Game Using Audio Information

DocumentCode :

1415912

Title :

Inferring the Structure of a Tennis Game Using Audio Information

Author :

Huang, Qiang ; Cox, Stephen

Author_Institution :

Sch. of Comput. Sci., Univ. of East Anglia, Norwich, UK

Volume :

Issue :

fYear :

2011

Firstpage :

1925

Lastpage :

1937

Abstract :

We describe a novel framework for inferring the low-level structure of a sports game (tennis) using only the information available on the audio track of a video recording of the game. Our goal is to segment the game into a sequence of points, the natural unit for describing a tennis match. The framework is hierarchical, consisting of, at the lowest level, identification of audio events, followed by “match” (i.e., semantic) events and, at the highest level, game points. Different techniques that are appropriate to the characteristics of each of these events are used to detect them, and these techniques are coupled in a probabilistic framework. The techniques consist of Gaussian mixture models and a hierarchical language model to detect sequences of audio events, a maximum entropy Markov model to infer “match” events from these audio events, and multigrams to infer the segmentation of a sequence of match events into sequences of points in a tennis game. Our results are promising, giving an F-score for the final detection of points of > 0.7.

Keywords :

Gaussian processes; Markov processes; audio signal processing; computer games; probability; sport; video recording; Gaussian mixture models; audio event sequences detection; audio information; game points; hierarchical language model; maximum entropy Markov model; probabilistic framework; tennis game structure; Event detection; Feature extraction; Games; Hidden Markov models; Semantics; Speech processing; Visualization; Audio characterization; categorization; classification; multigram model

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2010.2103059

Filename :

5677600

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1415912