DocumentCode :
2959088
Title :
Learning cross-modality similarity for multinomial data
Author :
Jia, Yangqing ; Salzmann, Mathieu ; Darrell, Trevor
Author_Institution :
UC Berkeley EECS, Berkeley, CA, USA
fYear :
2011
fDate :
6-13 Nov. 2011
Firstpage :
2407
Lastpage :
2414
Abstract :
Many applications involve multiple modalities, such as text and images, that describe the problem of interest. To leverage the information present in all the modalities, one must model the relationships between them. While some techniques have been proposed to tackle this problem, they are either restricted to words describing visual objects, or require full correspondences between the different modalities. As a consequence, they cannot handle more realistic scenarios where a narrative text is only loosely related to an image and where only a few image-text pairs are available. In this paper, we propose a model that addresses both of these challenges. Our model can be seen as a Markov random field of topic models that connects documents based on their similarity. As a consequence, the topics learned with our model are shared across connected documents, thus encoding the relations between the different modalities. We demonstrate the effectiveness of our model for image retrieval from loosely related text.
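The abstract describes the model only at a high level: an MRF of topic models whose pairwise links encourage connected documents (e.g., a loosely paired image and text) to share topics. The sketch below is a toy illustration of that coupling idea, not the paper's algorithm; the count data, the edge list, the quadratic pairwise penalty, and the numerical-gradient optimizer are all illustrative assumptions made to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: 4 "documents" (say, 2 images as visual-word counts and 2 texts),
# each a count vector over a shared vocabulary of V tokens.
V, K = 10, 3                        # vocabulary size, number of topics
counts = rng.integers(0, 5, size=(4, V)).astype(float)

# MRF edges: which documents are believed to be related (loose image-text pairs).
edges = [(0, 2), (1, 3)]            # hypothetical correspondences
lam = 5.0                           # strength of the cross-document coupling

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Parameters: per-document topic logits and shared topic-word logits.
doc_logits = rng.normal(size=(4, K))
topic_logits = rng.normal(size=(K, V))

def objective():
    theta = softmax(doc_logits)     # per-document topic proportions
    beta = softmax(topic_logits)    # topic-word distributions
    probs = theta @ beta            # per-document word distributions
    loglik = (counts * np.log(probs + 1e-12)).sum()
    # MRF pairwise term: connected documents should share topic proportions.
    penalty = sum(((theta[i] - theta[j]) ** 2).sum() for i, j in edges)
    return -loglik + lam * penalty

def num_grad(f, x, eps=1e-5):
    # Crude central-difference gradient, just to keep the sketch dependency-free.
    g = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        old = x[idx]
        x[idx] = old + eps; hi = f()
        x[idx] = old - eps; lo = f()
        x[idx] = old
        g[idx] = (hi - lo) / (2 * eps)
        it.iternext()
    return g

for step in range(200):
    doc_logits -= 0.01 * num_grad(objective, doc_logits)
    topic_logits -= 0.01 * num_grad(objective, topic_logits)

print("topic proportions (connected documents should be close):")
print(np.round(softmax(doc_logits), 2))
```

The paper's model is a proper probabilistic topic model with latent topic assignments; this sketch replaces that with a simple penalized maximum-likelihood surrogate purely to convey how MRF links tie topic proportions across modalities.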
Keywords :
Markov processes; document image processing; image retrieval; random processes; text analysis; Markov random field; connected document; cross-modality similarity learning; image retrieval; image-text pair; multinomial data; narrative text; Computational modeling; Data models; Electronic publishing; Encyclopedias; Image edge detection; Internet;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2011 IEEE International Conference on Computer Vision (ICCV)
Conference_Location :
Barcelona
ISSN :
1550-5499
Print_ISBN :
978-1-4577-1101-5
Type :
conf
DOI :
10.1109/ICCV.2011.6126524
Filename :
6126524