Regional Information Center for Science and Technology - Learning Feature Hierarchies: A Layer-Wise Tag-Embedded Approach

DocumentCode :

4612

Title :

Learning Feature Hierarchies: A Layer-Wise Tag-Embedded Approach

Author :

Zhaoquan Yuan ; Changsheng Xu ; Jitao Sang ; Shuicheng Yan ; Hossain, M. Shamim

Author_Institution :

Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China

Volume :

Issue :

fYear :

2015

fDate :

June 2015

Firstpage :

816

Lastpage :

827

Abstract :

Feature representation learning is an important and fundamental task in multimedia and pattern recognition research. In this paper, we propose a novel framework to explore the hierarchical structure within images from the perspective of feature representation learning, and apply it to hierarchical image annotation. In contrast to the current trend in multimedia analysis of using pre-defined features or focusing on the end-task “flat” representation, we propose a novel layer-wise tag-embedded deep learning (LTDL) model to learn hierarchical features that correspond to the hierarchical semantic structures in the tag hierarchy. Unlike most existing deep learning models, LTDL utilizes both the visual content of the image and the hierarchical information of its associated social tags. In the training stage, the two kinds of information are fused in a bottom-up way. Supervised training and multi-modal fusion alternate layer by layer to learn feature hierarchies. To validate the effectiveness of LTDL, we conduct extensive experiments on hierarchical image annotation using a large-scale public dataset. Experimental results show that the proposed LTDL can learn representative features with improved performance.

Keywords :

feature extraction; image fusion; image representation; learning (artificial intelligence); multimedia communication; LTDL; end task flat representation; feature representation; hierarchical image annotation; hierarchical semantic structure; information fusion; layerwise tag embedded approach; learning feature hierarchy; multimedia analysis; multimodal fusion; social tags; supervised training; tag hierarchy; visual content; Abstracts; Learning systems; Multimedia communication; Principal component analysis; Semantics; Training; Visualization; Auto-encoder; deep learning; hierarchical feature learning; social tags;

fLanguage :

English

Journal_Title :

Multimedia, IEEE Transactions on

Publisher :

IEEE

ISSN :

1520-9210

Type :

jour

DOI :

10.1109/TMM.2015.2417777

Filename :

7070707

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=4612