DocumentCode :
3405780
Title :
Topic regression multi-modal Latent Dirichlet Allocation for image annotation
Author :
Putthividhya, Duangmanee ; Attias, Hagai T. ; Nagarajan, Srikantan S.
Author_Institution :
UCSD, La Jolla, CA, USA
fYear :
2010
fDate :
13-18 June 2010
Firstpage :
3408
Lastpage :
3415
Abstract :
We present topic-regression multi-modal Latent Dirichlet Allocation (tr-mmLDA), a novel statistical topic model for the task of image and video annotation. At the heart of our new annotation model lies a novel latent variable regression approach to capture correlations between image or video features and annotation texts. Instead of sharing a set of latent topics between the 2 data modalities as in the formulation of correspondence LDA, our approach introduces a regression module to correlate the 2 sets of topics, which captures more general forms of association and allows the number of topics in the 2 data modalities to be different. We demonstrate the power of tr-mmLDA on 2 standard annotation datasets: a 5000-image subset of COREL and a 2687-image LabelMe dataset. The proposed association model shows improved performance over correspondence LDA as measured by caption perplexity.
Keywords :
image retrieval; regression analysis; video retrieval; COREL; LDA; LabelMe dataset; caption perplexity; image annotation; multimodal latent dirichlet allocation; statistical topic model; topic regression; video annotation; Content based retrieval; Image databases; Image retrieval; Inference algorithms; Information retrieval; Linear discriminant analysis; Multimedia databases; Multimedia systems; Predictive models; Video sharing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
Conference_Location :
San Francisco, CA
ISSN :
1063-6919
Print_ISBN :
978-1-4244-6984-0
Type :
conf
DOI :
10.1109/CVPR.2010.5540000
Filename :
5540000