• DocumentCode
    2426123
  • Title
    Information access using speech, speaker and face recognition
  • Author
    Viswanathan, M. ; Beigi, H.S.M. ; Tritschler, A. ; Maali, F.
  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    1
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    493
  • Abstract
    We describe a scheme to combine the results of audio and face identification for multimedia indexing and retrieval. Audio analysis consists of speech and speaker recognition derived from a broadcast news video clip. The video component is analyzed to identify the persons in the same video clip using face recognition. When applied individually, both speaker and face recognition schemes have limitations on the conditions under which they perform reasonably well. By integrating the match-score results of both audio and video analysis, we find that the two techniques can complement each other. We discuss the system architecture for such a combined system and how decision fusion is applied to disparate match-scoring systems to yield the final speaker identity.
  • Keywords
    content-based retrieval; database indexing; face recognition; multimedia databases; speaker recognition; audio analysis; broadcast news video clip; decision fusion; information access; match-scoring systems; multimedia indexing; multimedia retrieval; speech recognition; system architecture; Buffer storage; Digital multimedia broadcasting; Engines; Indexing; Multimedia communication; Signal processing algorithms; Speech analysis; Streaming media
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2000. ICME 2000. 2000 IEEE International Conference on
  • Conference_Location
    New York, NY
  • Print_ISBN
    0-7803-6536-4
  • Type
    conf
  • DOI
    10.1109/ICME.2000.869646
  • Filename
    869646