Regional Information Center for Science and Technology (RICeST) - Decision combination in speech metadata extraction

DocumentCode :

409671

Title :

Decision combination in speech metadata extraction

Author :

Lin, Xiaofan

Author_Institution :

Hewlett-Packard Labs., Palo Alto, CA, USA

Volume :

fYear :

2003

fDate :

9-12 Nov. 2003

Firstpage :

560

Abstract :

Speech metadata extraction can both improve speech recognition and enable novel interactive voice response applications. Unlike previous research, which concentrates on frame-level signal processing and pattern classification, this paper systematically studies the behavior of decision combination at the utterance level. We analyze its asymptotic characteristics and the factors affecting frame-level classification. In addition, we introduce new methods to combine frame-level decisions more accurately and efficiently, including phoneme/power-based weighting and smart sampling. Experimental results in gender classification are presented.
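
As a rough illustration of what utterance-level decision combination can look like, the Python sketch below combines per-frame scores into one utterance score using optional power-based weights, and computes how a simple majority vote behaves as the number of frames grows. It is not taken from the paper: the function names, the use of log-likelihood ratios as frame scores, the normalized weighted average, and the assumption that frames are statistically independent are all illustrative assumptions.

```python
import math


def utterance_score(frame_scores, frame_powers=None):
    """Weighted average of per-frame scores (hypothetical helper).

    frame_scores: per-frame scores, e.g. log-likelihood ratios where a
        positive value favors class A and a negative value favors class B
        (an assumption made for this sketch).
    frame_powers: optional non-negative weights, e.g. frame energy, so that
        louder frames count more. Uniform weights are used if omitted.
    """
    if frame_powers is None:
        frame_powers = [1.0] * len(frame_scores)
    if len(frame_scores) != len(frame_powers):
        raise ValueError("frame_scores and frame_powers must have equal length")
    total_weight = sum(frame_powers)
    if total_weight <= 0:
        raise ValueError("the weights must sum to a positive value")
    weighted = sum(w * s for w, s in zip(frame_powers, frame_scores))
    return weighted / total_weight


def majority_vote_accuracy(p, n):
    """Probability that a majority of n frames is correct.

    Assumes each frame is independently correct with probability p and that
    n is odd, so there are no ties. Real speech frames are correlated, so
    this is an idealized bound rather than a prediction.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd to avoid ties")
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    scores = [2.0, -1.0, -1.0]
    print(utterance_score(scores))                     # uniform weights
    print(utterance_score(scores, [10.0, 1.0, 1.0]))   # power-weighted
    for n in (1, 11, 101):
        print(n, majority_vote_accuracy(0.6, n))
```

Under these assumptions, weighting by power lets one high-energy frame outvote two weak frames, and the majority-vote accuracy rises toward 1 as more frames are combined whenever the per-frame accuracy exceeds 0.5.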

Keywords :

decision theory; meta data; speech recognition; decision combination; frame-level classification; gender classification; interactive voice response; smart sampling; speech metadata extraction; Automatic speech recognition; Data mining; Laboratories; Loudspeakers; Mel frequency cepstral coefficient; Pattern classification; Signal processing; Signal processing algorithms; Speech processing

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Signals, Systems and Computers, 2004. Conference Record of the Thirty-Seventh Asilomar Conference on

Print_ISBN :

0-7803-8104-1

Type :

conf

DOI :

10.1109/ACSSC.2003.1291973

Filename :

1291973

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=409671