DocumentCode :
2990740
Title :
Speaker sampling for enhanced diversity
Author :
Bernstein, Jared ; Kahn, Margaret ; Poza, Tito
Author_Institution :
SRI International, Menlo Park, CA
Volume :
10
fYear :
1985
fDate :
Apr 1985
Firstpage :
1553
Lastpage :
1556
Abstract :
Assembling a speech database that is both manageably small and sufficiently diverse can be a useful step in the development of speaker-independent speech recognition systems. Yet there have been no data on what kind of speaker sample might be required to ensure a group whose speech includes certain phonetic or linguistic traits. The data gathered in this study suggest that some common and important dialect features will not be found even in a large number of speakers if sampling is conducted at a single location. In order to compile a large pool of prospective speakers, 152 people were recorded for about one or two minutes speaking extemporaneously; the recordings were then rated by the three authors according to fifteen characteristics that form three classes: voice quality, manner of speaking, and dialect. Although a wide variety of voice characteristics and manners of speaking were evident among the 152 speakers, the dialect features covered a limited range. We discuss the possible causes of this distribution of characteristics in the sample and some of its implications for collecting adequate databases for speech recognition research.
Keywords :
Acoustic noise; Cities and towns; Ear; Frequency; Low-frequency noise; Noise shaping; Pulse shaping methods; Sampling methods; Spectral shape; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '85.
Type :
conf
DOI :
10.1109/ICASSP.1985.1168174
Filename :
1168174