DocumentCode :
3748724
Title :
Learning Common Sense through Visual Abstraction
Author :
Ramakrishna Vedantam;Xiao Lin;Tanmay Batra;C. Lawrence Zitnick;Devi Parikh
Author_Institution :
Virginia Tech, Blacksburg, VA, USA
fYear :
2015
Firstpage :
2542
Lastpage :
2550
Abstract :
Common sense is essential for building intelligent machines. While some commonsense knowledge is explicitly stated in human-generated text and can be learnt by mining the web, much of it is unwritten. It is often unnecessary and even unnatural to write about commonsense facts. While unwritten, this commonsense knowledge is not unseen! The visual world around us is full of structure modeled by commonsense knowledge. Can machines learn common sense simply by observing our visual world? Unfortunately, this requires automatic and accurate detection of objects, their attributes, poses, and interactions between objects, which remain challenging problems. Our key insight is that while visual common sense is depicted in visual content, it is the semantic features that are relevant and not low-level pixel information. In other words, photorealism is not necessary to learn common sense. We explore the use of human-generated abstract scenes made from clipart for learning common sense. In particular, we reason about the plausibility of an interaction or relation between a pair of nouns by measuring the similarity of the relation and nouns with other relations and nouns we have seen in abstract scenes. We show that the commonsense knowledge we learn is complementary to what can be learnt from sources of text.
Keywords :
"Visualization","Libraries","Semantics","Cognition","Data mining","Grounding","Training"
Publisher :
IEEE
Conference_Title :
Computer Vision (ICCV), 2015 IEEE International Conference on
Electronic_ISBN :
2380-7504
Type :
conf
DOI :
10.1109/ICCV.2015.292
Filename :
7410649