• DocumentCode
    1749723
  • Title
    Experiments on speech tracking in audio documents using Gaussian mixture modeling
  • Author
    Seck, Mouhamadou ; Magrin-Chagnolleau, Ivan ; Bimbot, Frédéric
  • Author_Institution
    IRISA, Rennes, France
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    601
  • Abstract
    This paper deals with the tracking of speech segments in audio documents. We use a cepstral-based acoustic analysis and Gaussian mixture models for the representation of the training data. Three ways of scoring an audio document based on a frame-level likelihood calculation are proposed and compared. Our experiments are done on a database composed of television programs including news reports, advertisements, and documentaries. The best equal error rate obtained is approximately 12%.
  • Keywords
    Gaussian processes; acoustic signal processing; audio signal processing; cepstral analysis; signal representation; speech processing; tracking; Gaussian mixture modeling; advertisements; audio document scoring; audio documents; cepstral-based acoustic analysis; covariance matrices; database; equal error rate; frame-level likelihood; music; news reports; noise segments; smoothed log-likelihood ratio; speech segments tracking; television programs; training data representation; Cepstral analysis; Covariance matrix; Databases; Error analysis; Indexing; Smoothing methods; Speech enhancement; TV; Testing; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.940903
  • Filename
    940903