• DocumentCode
    2713208
  • Title
    A comparison of features for speech, music discrimination
  • Author
    Carey, Michael J.; Parris, Eluned S.; Lloyd-Thomas, Harvey
  • Author_Institution
    Ensigma Ltd., Chepstow, UK
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    149
  • Abstract
    Several approaches have previously been taken to the problem of discriminating between speech and music signals. These have used different features as the input to the classifier and have tested and trained on different material. In this paper we examine the discrimination achieved by several different features using common training and test sets and the same classifier. The database assembled for these tests includes speech from thirteen languages and music from all over the world. In each case the distributions in the feature space were modelled by a Gaussian mixture model. Experiments were carried out on four types of feature: amplitude, cepstra, pitch and zero-crossings. In each case the derivative of the feature was also used and found to improve performance. The best performance resulted from using the cepstra and delta cepstra, which gave an equal error rate (EER) of 1.2%. This was closely followed by normalised amplitude and delta amplitude, which, however, used a much less complex model. The pitch and delta pitch gave an EER of 4%, which was better than the zero-crossings, which produced an EER of 6%.
  • Keywords
    Gaussian processes; audio signal processing; cepstral analysis; error statistics; music; signal classification; speech recognition; EER; Gaussian mixture model; amplitude; cepstra; classifier; delta amplitude; delta cepstra; equal error rate; feature; music discrimination; normalised amplitude; pitch; speech discrimination; zero-crossings; Amplitude estimation; Assembly; Bandwidth; Cepstral analysis; Frequency estimation; Materials testing; Multiple signal classification; Natural languages; Spatial databases; Speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.1999.758084
  • Filename
    758084
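
The abstract compares features by their equal error rate (EER), the operating point at which speech is rejected as often as music is accepted. The following is a minimal sketch of how such a figure could be computed from classifier scores; it is not the authors' procedure. The function name `equal_error_rate`, the convention that higher scores mean "more speech-like", and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(speech_scores, music_scores):
    """Approximate EER, assuming higher scores mean 'more speech-like'.

    Every observed score is tried as a decision threshold. The result is
    the average of the miss and false-alarm rates at the threshold where
    the two rates are closest to each other.
    """
    thresholds = sorted(set(speech_scores) | set(music_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # Miss: a speech clip scoring below the threshold is rejected.
        miss = sum(s < t for s in speech_scores) / len(speech_scores)
        # False alarm: a music clip scoring at or above the threshold is accepted.
        false_alarm = sum(m >= t for m in music_scores) / len(music_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + false_alarm) / 2
    return best_eer


# Hypothetical scores, e.g. log-likelihood ratios between a speech model
# and a music model; these values are invented for the example.
speech = [0.9, 0.8, 0.7, 0.4]
music = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(speech, music))
```

With these toy scores, one speech clip and one music clip fall on the wrong side of the best threshold, so both error rates are 1 in 4. On real data the two rates rarely coincide exactly, which is why the sketch averages them at the closest point.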