• DocumentCode
    1723639
  • Title
    Unsupervised Generation of Context-Relevant Training-Sets for Visual Object Recognition Employing Multilinguality

  • Author
    Schoeler, Markus ; Worgotter, Florentin ; Kulvicius, Tomas ; Papon, Jeremie

  • Author_Institution
    III. Phys. Inst. - Biophys., Georg-August Univ. of Gottingen, Gottingen, Germany
  • fYear
    2015
  • Firstpage
    805
  • Lastpage
    812
  • Abstract
    Image-based object classification requires clean training data sets. Gathering such sets is usually done manually by humans, which is time-consuming and laborious. On the other hand, directly using images from search engines creates very noisy data due to ambiguous noun-focused indexing. However, in daily speech, nouns and verbs are always coupled. We use this for the automatic generation of clean data sets with the TRANSCLEAN algorithm presented here, which, through the use of multiple languages, also solves the problem of polysemes (a single spelling with multiple meanings). Thus, we use the implicit knowledge contained in verbs, e.g. in an imperative such as "hit the nail", which implies a metal nail and not a fingernail. One type of reference application where this method can operate automatically is human-robot collaboration based on discourse. A second is the generation of clean image data sets, where tedious manual cleaning can be replaced by the much simpler manual generation of a single relevant verb-noun tuple. Here we show the impact of our improved training sets for several widely used and state-of-the-art classifiers, including Multipath Hierarchical Matching Pursuit. All tested classifiers show a substantial boost of about +20% in recognition performance.
  • Keywords
    image classification; indexing; iterative methods; natural language processing; object recognition; unsupervised learning; TRANSCLEAN algorithm; automatic clean data set generation; clean image data sets; fingernail; hit the nail; human-robot collaboration; image based object classification; metal nail; multilinguality; multipath hierarchical matching pursuit; noun-focused indexing; polysemes; search engines; unsupervised context-relevant training-set generation; verb-noun tuple; visual object recognition; Clutter; Context; Fasteners; Google; Nails; Search engines; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on
  • Conference_Location
    Waikoloa, HI
  • Type
    conf
  • DOI
    10.1109/WACV.2015.112
  • Filename
    7045966