DocumentCode
2980346
Title
Decision-tree based feature-space quantization for fast Gaussian computation
Author
Padmanabhan, M. ; Jan, E.E. ; Bahl, L.R. ; Picheny, M.
Author_Institution
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
fYear
1997
fDate
14-17 Dec 1997
Firstpage
325
Lastpage
330
Abstract
Presents a decision-tree based procedure to quantize the feature-space of a speech recognizer, with the motivation of reducing the computation time required for evaluating Gaussians in a speech recognition system. The entire feature space is quantized into non-overlapping regions, where each region is bounded by a number of hyperplanes. Each region is characterized by the occurrence of only a small number of the total alphabet of allophones (sub-phonetic speech units). By identifying the region in which a test feature vector lies, only the Gaussians that model the density of allophones that exist in that region need be evaluated. The quantization of the feature space is done in a hierarchical manner using a binary decision tree. Each node of the decision tree represents a region of the feature space, and is further characterized by a hyperplane (a vector v_n and a scalar threshold value h_n) that subdivides the region corresponding to the current node into two non-overlapping regions corresponding to the two children of the current node. Given a test feature vector, the process of finding the region that it lies in involves traversing this binary decision tree, which is computationally inexpensive. We present results of experiments that show that the Gaussian computation time can be reduced by as much as a factor of 20 with negligible degradation in accuracy. We also examine issues of robustness to different environments.
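The traversal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Node`, `find_region`, and `gaussians_to_evaluate` are hypothetical, and the convention that a projection below the threshold h_n goes to the left child is an assumption, since the abstract does not say which side of the hyperplane maps to which child.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Sequence


@dataclass
class Node:
    """One node of the binary decision tree (hypothetical layout)."""
    v: Optional[Sequence[float]] = None   # hyperplane vector v_n (internal nodes)
    h: float = 0.0                        # scalar threshold h_n (internal nodes)
    left: Optional["Node"] = None         # child for v_n . x <  h_n (assumed convention)
    right: Optional["Node"] = None        # child for v_n . x >= h_n (assumed convention)
    allophones: FrozenSet[str] = frozenset()  # allophones seen in this region (leaves)


def find_region(root: Node, x: Sequence[float]) -> Node:
    """Walk from the root to the leaf whose region contains x.

    Each step costs one dot product and one comparison.
    """
    node = root
    while node.left is not None and node.right is not None:
        projection = sum(vi * xi for vi, xi in zip(node.v, x))
        node = node.left if projection < node.h else node.right
    return node


def gaussians_to_evaluate(
    root: Node, x: Sequence[float], gaussians_by_allophone: Dict[str, List[str]]
) -> List[str]:
    """Return only the Gaussians tied to allophones present in x's region."""
    leaf = find_region(root, x)
    return [
        g
        for allophone in sorted(leaf.allophones)
        for g in gaussians_by_allophone.get(allophone, [])
    ]


# Toy two-region tree; the allophone and Gaussian labels are made up.
tree = Node(
    v=(1.0, 0.0),
    h=0.0,
    left=Node(allophones=frozenset({"AA_1"})),
    right=Node(allophones=frozenset({"IY_2", "S_1"})),
)
gaussians = {"AA_1": ["g0", "g1"], "IY_2": ["g2"], "S_1": ["g3", "g4"]}
shortlist = gaussians_to_evaluate(tree, (2.0, 0.5), gaussians)
```

In this toy example the test vector projects to 2.0 on the root hyperplane, so it falls in the right-hand region and only the Gaussians of the two allophones listed there are shortlisted; the remaining Gaussians are never evaluated.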
Keywords
Gaussian distribution; quantisation (signal); speech recognition; trees (mathematics); accuracy; allophone alphabet; allophone density modelling; binary decision tree; computation time; fast Gaussian computation; feature space regions; feature-space quantization; hyperplanes; nonoverlapping regions; robustness; scalar threshold value; speech recognition system; subphonetic speech units; test feature vector; Decision trees; Degradation; Error analysis; Robustness; Speech analysis; Speech recognition; Testing; Vector quantization
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 1997. Proceedings., 1997 IEEE Workshop on
Conference_Location
Santa Barbara, CA
Print_ISBN
0-7803-3698-4
Type
conf
DOI
10.1109/ASRU.1997.659107
Filename
659107