• DocumentCode
    2990740
  • Title
    Speaker sampling for enhanced diversity
  • Author
    Bernstein, Jared ; Kahn, Margaret ; Poza, Tito
  • Author_Institution
    SRI International, Menlo Park, CA
  • Volume
    10
  • fYear
    1985
  • fDate
    Apr 1985
  • Firstpage
    1553
  • Lastpage
    1556
  • Abstract
    Assembling a speech database that is both manageably small and sufficiently diverse can be a useful step in the development of speaker-independent speech recognition systems. Yet there has been no data on what kind of speaker sample might be required to ensure a group whose speech includes certain phonetic or linguistic traits. The data gathered in this study suggest that some common and important dialect features will not be found even in a large number of speakers if sampling is conducted at a single location. In order to compile a large pool of prospective speakers, 152 people were recorded speaking extemporaneously for about one to two minutes; the recordings were then rated by the three authors according to fifteen characteristics that form three classes: voice quality, manner of speaking, and dialect. Although a wide variety of voice characteristics and manners of speaking were evident among the 152 speakers, the dialect features covered a limited range. We discuss the possible causes of this distribution of characteristics in the sample and some of its implications for collecting adequate databases for speech recognition research.
  • Keywords
    Acoustic noise; Cities and towns; Ear; Frequency; Low-frequency noise; Noise shaping; Pulse shaping methods; Sampling methods; Spectral shape; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '85.
  • Type
    conf
  • DOI
    10.1109/ICASSP.1985.1168174
  • Filename
    1168174