• DocumentCode
    3748580
  • Title

    LEWIS: Latent Embeddings for Word Images and Their Semantics

  • Author

    Albert Gordo; Almaz?n;Naila Murray;Florent Perronin

  • fYear
    2015
  • Firstpage
    1242
  • Lastpage
    1250
  • Abstract
    The goal of this work is to bring semantics into the tasks of text recognition and retrieval in natural images. Although text recognition and retrieval have received a lot of attention in recent years, previous works have focused on recognizing or retrieving exactly the same word used as a query, without taking the semantics into consideration. In this paper, we ask the following question: can we predict semantic concepts directly from a word image, without explicitly trying to transcribe the word image or its characters at any point? For this goal we propose a convolutional neural network (CNN) with a weighted ranking loss objective that ensures that the concepts relevant to the query image are ranked ahead of those that are not relevant. This can also be interpreted as learning a Euclidean space where word images and concepts are jointly embedded. This model is learned in an end-to-end manner, from image pixels to semantic concepts, using a dataset of synthetically generated word images and concepts mined from a lexical database (WordNet). Our results show that, despite the complexity of the task, word images and concepts can indeed be associated with a high degree of accuracy.
  • Keywords
    "Semantics","Image recognition","Text recognition","Computer vision","Image representation","Neural networks","Databases"
  • Publisher
    ieee
  • Conference_Titel
    Computer Vision (ICCV), 2015 IEEE International Conference on
  • Electronic_ISBN
    2380-7504
  • Type

    conf

  • DOI
    10.1109/ICCV.2015.147
  • Filename
    7410504