DocumentCode :
1798867
Title :
Plant identification with noisy web data
Author :
Zhang, William Y. ; Hua, Xian-Sheng
fYear :
2014
fDate :
14-18 July 2014
Firstpage :
1
Lastpage :
6
Abstract :
One of the main problems in image-based plant identification has been the lack of high-quality training image data. This paper proposes several methods that address this problem by generating high-quality plant image sets from crowd-sourced Web image collections such as Flickr. The methods automatically identify correct and informative training images among Web images, whose metadata (for example, user tags in Flickr) is typically very noisy, and use them to enhance an existing manually labeled training set. First, for each plant, a set of images is collected by searching Flickr with the plant name as the query. The images are then grouped into visually consistent clusters, so that, ideally, most images within a cluster are either all relevant or all irrelevant to the particular plant. A managed plant image data set from ImageCLEF then serves as a reference for automatically selecting the highest-quality cluster for each plant. The image quality of the selected clusters is further improved by two algorithms: an iterative method and image-similarity-based ranking. We show that the larger training data set selected automatically in this way significantly increases the accuracy of image-based plant identification. In addition, the approach is a generic solution to almost all image recognition problems, provided that additional (noisy) training data can be obtained automatically from the Internet.
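The cluster-then-select pipeline described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: images are represented as plain feature vectors, clustering uses a basic k-means, the "highest-quality" cluster is taken to be the one whose centroid lies closest to the mean of the reference (ImageCLEF-style) features, and ranking uses Euclidean distance to that reference mean. The record does not specify the paper's features, clustering algorithm, selection criterion, or iterative refinement step, and the function names below are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Basic k-means (assumed stand-in for the paper's clustering step)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = mean(members)
    return assign(points, centroids), centroids


def select_and_rank(web_features, reference_features, k=2):
    """Pick the cluster nearest the reference set, then rank its members.

    Assumption: cluster quality is approximated by the distance between
    the cluster centroid and the mean of the trusted reference features.
    """
    labels, centroids = kmeans(web_features, k)
    ref_center = mean(reference_features)
    best = min(set(labels), key=lambda c: dist(centroids[c], ref_center))
    chosen = [p for p, l in zip(web_features, labels) if l == best]
    # Most reference-like images first.
    return sorted(chosen, key=lambda p: dist(p, ref_center))
```

Calling `select_and_rank(web_features, reference_features)` returns the noisy Web images of the selected cluster, ordered from most to least similar to the reference set; a prefix of that list could then be added to the labeled training data.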
Keywords :
Internet; agricultural engineering; image retrieval; iterative methods; object detection; object recognition; pattern clustering; social networking (online); Flickr; ImageCLEF; Internet; Web image collection; automatic informative training image identification; image based plant identification; image clustering; image quality; image recognition problem; image similarity based ranking; iterative method; labeled training set; noisy Web data; query processing; visually consistent cluster; Accuracy; Clustering algorithms; Data models; Noise measurement; Support vector machines; Training; Training data; Image classification; crowd sourced big data; machine learning; plant identification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia and Expo (ICME), 2014 IEEE International Conference on
Conference_Location :
Chengdu
Type :
conf
DOI :
10.1109/ICME.2014.6890180
Filename :
6890180