DocumentCode :
1760428
Title :
Interactive Multimodal Visual Search on Mobile Device
Author :
Houqiang Li ; Yang Wang ; Tao Mei ; Jingdong Wang ; Shipeng Li
Author_Institution :
Univ. of Sci. & Technol. of China, Hefei, China
Volume :
15
Issue :
3
fYear :
2013
fDate :
April 2013
Firstpage :
594
Lastpage :
607
Abstract :
This paper describes a novel multimodal interactive image search system for mobile devices. The system, Joint search with ImaGe, Speech, And Word Plus (JIGSAW+), takes full advantage of the multimodal input and natural user interactions of mobile devices. It is designed for users who already have a picture in mind but lack a precise description or name for it. The user first describes the picture by speech and then refines the recognized query by interactively composing a visual query from exemplary images, so the desired images can be found through a few natural multimodal interactions with the mobile device. Compared with our previous work, JIGSAW, the algorithm is significantly improved in three respects: 1) a segmentation-based image representation is adopted to remove the artificial block partitions; 2) relative position checking replaces the fixed position penalty; and 3) an inverted index is constructed in place of brute-force matching. The proposed JIGSAW+ achieves a 5% gain in search performance and is ten times faster.
Keywords :
image representation; image retrieval; image segmentation; interactive systems; mobile computing; speech recognition; speech-based user interfaces; JIGSAW+; artificial block partition removal; exemplary images; interactive multimodal visual search; inverted index construction; joint search-with-image-speech-and-word plus; mobile devices; multimodal input user interactions; multimodal interactive image search system; natural multimodal interactions; natural user interactions; segmentation-based image representation; visual query; Databases; Internet; Mobile communication; Mobile handsets; Speech; Speech recognition; Visualization; Interactive search; mobile device; mobile visual search; multimodal search;
fLanguage :
English
Journal_Title :
Multimedia, IEEE Transactions on
Publisher :
IEEE
ISSN :
1520-9210
Type :
jour
DOI :
10.1109/TMM.2012.2234730
Filename :
6384798