• DocumentCode
    2409616
  • Title

    Multimedia Content Segmentation Based on Speaker Recognition

  • Author

    Babu, Jasine ; Pathari, Vinod

  • Author_Institution
    Motorola India Pvt. Ltd., Bangalore
  • fYear
    2007
  • fDate
    22-24 Feb. 2007
  • Firstpage
    16
  • Lastpage
    19
  • Abstract
    Many recent works attempt to index multimedia data based on characteristics such as speaker identity and emotional content. In this work, speaker segmentation is performed on movies to extract the shots in which the target actor is speaking. This is a case of speaker identification on conversational speech under noisy conditions. The work is organized into two phases: an audio classification phase, for the removal of non-speech content, followed by a speaker recognition phase. Along with the speaker models, Gaussian mixture models are constructed for sound effects such as fight sequences and drum beats to refine the removal of non-speech sounds. Results show the effectiveness of this deviation from conventional methods.
  • Keywords
    Gaussian processes; audio signal processing; multimedia communication; speaker recognition; Gaussian mixture model; multimedia content segmentation; Acoustic noise; Data mining; Face detection; Indexing; Information retrieval; Loudspeakers; Motion pictures; Multimedia databases; Speech processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing, Communications and Networking, 2007. ICSCN '07. International Conference on
  • Conference_Location
    Chennai
  • Print_ISBN
    1-4244-0997-7
  • Electronic_ISBN
    1-4244-0997-7
  • Type

    conf

  • DOI
    10.1109/ICSCN.2007.350672
  • Filename
    4156575