• DocumentCode
    417720
  • Title

    Combining multiple representations on the TRECVID search task [video retrieval system]

  • Author

    De Vries, Arjen P. ; Westerveld, Thijs ; Ianeva, Tzveta

  • Author_Institution
    CWI, Amsterdam, Netherlands
  • Volume
    3
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    This paper presents a (preliminary) analysis of the evaluation results obtained on the TRECVID 2003 search task. We study in particular the effects of combining multiple representations on retrieval: multiple representations of video content (speech and visual) and of the user information need (multiple visual examples). From our multi-modal retrieval experiments we draw the following working hypothesis: even though the automatic speech recognition run is usually better than the visual run, matching against both modalities ensures robustness against choosing the wrong content representation. For the same reason, using multiple visual examples to represent the user information need is preferable to using a single designated example only.
  • Keywords
    image recognition; image representation; information retrieval; speech recognition; video signal processing; ASR; TRECVID search task; automatic speech recognition; multimodal retrieval; multiple representations combining; speech content; user information need; video content; video retrieval; video retrieval system; visual content; Automatic speech recognition; Content based retrieval; Covariance matrix; Discrete cosine transforms; Gaussian processes; Image retrieval; Information retrieval; Merging; Pixel; Robustness;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type

    conf

  • DOI
    10.1109/ICASSP.2004.1326729
  • Filename
    1326729