Title :
A submodular optimization approach to sentence set selection
Author_Institution :
Corp. R&D Center, Toshiba Corp., Kawasaki, Japan
Abstract :
A new method for selecting a sentence set with a desired phoneme distribution is presented. Selecting a sentence set for speech corpus recording is a fundamental step in speech processing research, and the design of phonetically balanced sentence sets has been studied extensively. One popular approach is to select a sentence set whose phoneme distribution is close to a given (desired) distribution. Several methods have been proposed in the literature to realize this approach. However, these methods are heuristic and therefore offer no guarantee of optimality. In this paper, we propose a near-optimal method for selecting sentence sets under this approach. We first define our objective function and show that it is submodular. We then show, using submodular optimization theory, that a greedy algorithm is near-optimal for this problem. We also show that a significant speedup is possible by exploiting the submodularity of the objective function. Our experimental results on Japanese phonetically balanced sentence set selection show the effectiveness of the proposed method.
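The greedy selection and the submodularity-based speedup mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's objective function, so the code below assumes a stand-in surrogate, f(S) = sum_i p_i * log(1 + c_i(S)), where p_i is the desired probability of phoneme i and c_i(S) is its count in the selected set; this surrogate is monotone submodular but is not claimed to be the authors' objective. The speedup shown is the standard lazy (accelerated) greedy evaluation, which may differ from the technique used in the paper. The names `objective` and `lazy_greedy` and the toy data are hypothetical.

```python
import heapq
import math
from collections import Counter


def objective(counts, target):
    # Assumed surrogate objective (NOT taken from the paper):
    # f(S) = sum_i p_i * log(1 + c_i(S)), which is monotone submodular.
    return sum(p * math.log1p(counts.get(ph, 0)) for ph, p in target.items())


def lazy_greedy(sentences, target, k):
    """Pick up to k sentence indices; each sentence is a list of phonemes."""
    sent_counts = [Counter(s) for s in sentences]
    total = Counter()                      # phoneme counts of the selected set
    current = objective(total, target)
    # Max-heap via negated gains. Each entry is (-gain, index, stamp), where
    # stamp is the number of selected sentences when the gain was computed.
    heap = [(-(objective(c, target) - current), i, 0)
            for i, c in enumerate(sent_counts)]
    heapq.heapify(heap)
    selected = []
    while heap and len(selected) < k:
        neg_gain, i, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is up to date. Submodularity means every stale gain in
            # the heap is an upper bound, so this candidate is the true best.
            selected.append(i)
            total.update(sent_counts[i])
            current = objective(total, target)
        else:
            # Stale gain: recompute against the current selection, push back.
            gain = objective(total + sent_counts[i], target) - current
            heapq.heappush(heap, (-gain, i, len(selected)))
    return selected


# Toy data (hypothetical): four "sentences" over phonemes a, b, c.
toy_sentences = [["a", "b"], ["a", "a"], ["c"], ["b", "c", "c"]]
toy_target = {"a": 0.5, "b": 0.3, "c": 0.2}
toy_selection = lazy_greedy(toy_sentences, toy_target, 2)
```

Because marginal gains of a submodular function can only shrink as the selected set grows, a previously computed gain is a valid upper bound, and most candidates never need to be re-evaluated in a given round. For a monotone submodular objective under a cardinality constraint, plain greedy selection is known to achieve at least a (1 - 1/e) fraction of the optimum, and lazy evaluation returns the same selection as plain greedy, up to tie-breaking.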
Keywords :
greedy algorithms; optimisation; speech processing; speech recognition; greedy algorithm; phoneme distribution; sentence set selection; speech corpus recording; submodular optimization approach; Buildings; Linear programming; Optimization; Speech; Corpus design; Kullback-Leibler divergence; speech recognition and synthesis; submodular optimization
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854375