Learning unified sparse representations for multi-modal data

Author

Kaiye Wang; Wei Wang; Liang Wang

Author_Institution

Center for Res. on Intell. Perception &

Year

2015

Firstpage

3545

Lastpage

3549

Abstract

Cross-modal retrieval has recently become an interesting and important research problem, in which users take data of one modality (e.g., text, image, or video) as the query to retrieve relevant data of another modality. In this paper, we present a Multi-modal Unified Representation Learning (MURL) algorithm for cross-modal retrieval, which uses joint dictionary learning to learn unified sparse representations for multi-modal data that represent the same semantics. The ℓ₁-norm is imposed on the unified representations to explicitly encourage sparsity, which makes our algorithm more robust. Furthermore, a constraint regularization term is imposed to force the representations to be similar if their corresponding multi-modal data have must-links and to be far apart if they have cannot-links. An iterative algorithm is also proposed to solve the objective function. The effectiveness of the proposed method is verified by extensive results on two real-world datasets.
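
The abstract does not say how the ℓ₁ penalty is handled inside the iterative solver. As an illustration only, the sketch below shows soft-thresholding, the standard proximal step for an ℓ₁-norm penalty, which is one common way such a penalty drives small coefficients to exactly zero. The function name soft_threshold, the parameter lam, and the sample values are assumptions made for this example and are not taken from the paper.

```python
def soft_threshold(values, lam):
    """Elementwise proximal operator of lam * ||a||_1.

    Illustrative helper (not from the paper): entries with magnitude
    at most lam become exactly zero, larger entries shrink toward zero by lam.
    """
    out = []
    for v in values:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


# Hypothetical dense representation of one multi-modal sample.
code = [0.9, -0.05, 0.2, -1.4, 0.01]

# A larger lam zeroes more coefficients, giving a sparser representation.
sparse_code = soft_threshold(code, lam=0.25)
num_zeros = sum(1 for v in sparse_code if v == 0.0)
```

In this example the three entries whose magnitude is at most lam are set to zero, while the two large entries keep their sign and shrink by lam.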

Keywords

"Dictionaries","Electronic publishing","Internet","Optimization","Semantics","Linear programming"

Publisher

IEEE

Conference_Title

Image Processing (ICIP), 2015 IEEE International Conference on

Type

conf

DOI

10.1109/ICIP.2015.7351464

Filename

7351464

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3707873