• DocumentCode
    457312
  • Title

    Audio Segmentation and Speaker Localization in Meeting Videos

  • Author

    Vajaria, Himanshu ; Islam, Tanmoy ; Sarkar, Sudeep ; Sankar, Ravi ; Kasturi, Ranga

  • Author_Institution
    Dept. of Comput. Sci. & Eng., South Florida Univ., Tampa, FL
  • Volume
    2
  • fYear
    2006
  • fDate
    0-0 0
  • Firstpage
    1150
  • Lastpage
    1153
  • Abstract
    Segmenting the individuals in a group meeting and their speech is an important first step for tasks such as meeting transcription, automatic camera panning, multimedia retrieval and monologue detection. Given a meeting room video, we attempt to segment each person's speech and localize the speaker in the video, based on data from a single audio and video source. The segmentation method is driven by audio and enhanced by video cues. We use the Bayesian information criterion (BIC) to segment the feature vector streams and graph spectral partitioning to cluster them. We compare our results with an audio-based segmentation method and our localization technique with the commonly used mutual information
  • Keywords
    Bayes methods; audio signal processing; graph theory; speech processing; vectors; video signal processing; Bayesian information criterion; audio segmentation; automatic camera panning; feature vector streams; graph spectral partitioning; meeting transcription; monologue detection; multimedia retrieval; speaker localization; speech segmentation; Cameras; Face detection; Image segmentation; Microphones; Mutual information; NIST; Robustness; Speech; Streaming media; Videos
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-2521-0
  • Type

    conf

  • DOI
    10.1109/ICPR.2006.283
  • Filename
    1699413