• DocumentCode
    2882622
  • Title
    Speech and music classification in audio documents
  • Author
    Pinquier, Julien ; Senac, Christine
  • Author_Institution
    Régine André-Obrecht, IRIT, France
  • Volume
    4
  • fYear
    2002
  • fDate
    13-17 May 2002
  • Abstract
    To index the soundtrack of multimedia documents efficiently, it is necessary to extract elementary and homogeneous acoustic segments. In this paper, we explore such a prior partitioning, which consists in detecting the two basic components, speech and music. The originality of this work is that music and speech are not considered as two classes; instead, two classification systems are defined independently, a speech/non-speech one and a music/non-music one. This approach makes it possible to better characterize and discriminate each component: in particular, two different feature spaces are necessary, as are two pairs of Gaussian mixture models. Moreover, the acoustic signal is divided into four types of segments: speech, music, speech-music and other. The experiments are performed on the soundtracks of audio-video documents (films, TV sport broadcasts). The performance demonstrates the value of this approach, called the Differentiated Modeling Approach.
  • Keywords
    Colored noise; Speech; Speech enhancement;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
  • Conference_Location
    Orlando, FL, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2002.5745593
  • Filename
    5745593