• DocumentCode
    3363945
  • Title
    Robust visual features for the multimodal identification of unregistered speakers in TV talk-shows
  • Author
    Vallet, Félicien ; Essid, Slim ; Carrive, Jean ; Richard, Gaël

  • Author_Institution
    CNRS/LTCI, Telecom ParisTech, Paris, France
  • fYear
    2010
  • fDate
    26-29 Sept. 2010
  • Firstpage
    1469
  • Lastpage
    1472
  • Abstract
    In this paper, we propose a novel multimodal method for identifying unregistered speakers in a TV talk-show using a semi-supervised learning approach based on Support Vector Machines. Our study shows that specific visual features are very effective for this particular type of video content, which is edited from multi-camera recordings. These visual features, motivated by prior knowledge of how the TV director chooses the appropriate shots, bring a significant improvement in identification accuracy when used together with classic audio Mel-frequency cepstral coefficients (+8% compared to various baseline systems, in particular a standard audio-only system).
  • Keywords
    feature extraction; learning (artificial intelligence); multimedia systems; speaker recognition; support vector machines; television; video cameras; video recording; video signal processing; TV director; TV talk-show; audio Mel-frequency cepstral coefficient; multicamera recording; multimedia system; multimodal identification; semisupervised learning; support vector machine; unregistered speaker identification; video content; visual feature; Face; Feature extraction; Image color analysis; Robustness; Speech; Support vector machines; Visualization; image analysis; multimedia databases; pattern classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2010 17th IEEE International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4244-7992-4
  • Electronic_ISBN
    1522-4880
  • Type
    conf
  • DOI
    10.1109/ICIP.2010.5653393
  • Filename
    5653393