  • DocumentCode
    105473
  • Title
    A Cross-Modal Approach for Extracting Semantic Relationships Between Concepts Using Tagged Images
  • Author
    Katsurai, Makoto; Ogawa, Tomomi; Haseyama, Miki
  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo, Japan
  • Volume
    16
  • Issue
    4
  • fYear
    2014
  • fDate
    Jun-14
  • Firstpage
    1059
  • Lastpage
    1074
  • Abstract
    This paper presents a cross-modal approach for extracting semantic relationships between concepts using tagged images. In the proposed method, we first project both the text and visual features of the tagged images to a latent space using canonical correlation analysis (CCA). Then, under the probabilistic interpretation of CCA, we calculate a representative distribution of the latent variables for each concept. Based on the representative distributions of the concepts, we derive two types of measures: the semantic relatedness between concepts and the abstraction level of each concept. Because these measures are derived from a cross-modal scheme that enables the collaborative use of both text and visual features, the resulting semantic relationships reflect both semantic and visual contexts. Experiments conducted on tagged images collected from Flickr show that our measures are more consistent with human cognition than conventional measures that use either text or visual features alone, and than WordNet-based measures. In particular, a new measure of semantic relatedness, which satisfies the triangle inequality, obtains the best results among the distance measures evaluated in our framework. The experiments also show the applicability of our measures to multimedia-related tasks such as concept clustering, image annotation, and tag recommendation.
  • Keywords
    Web sites; correlation methods; database management systems; feature extraction; multimedia communication; natural language processing; statistical analysis; CCA probabilistic interpretation; Flickr; WordNet-based measures; abstraction level; canonical correlation analysis; concept clustering; cross-modal scheme; distance measures; human cognition; image annotation; latent space; multimedia-related tasks; semantic contexts; semantic relatedness; semantic relationship extraction; tag recommendation; tagged images; text features; triangle inequality; visual contexts; visual features; Atmospheric measurements; Biomedical measurement; Feature extraction; Particle measurements; Probabilistic logic; Semantics; Visualization; Canonical correlation analysis; concept relationships; Flickr; tagged images
  • fLanguage
    English
  • Journal_Title
    Multimedia, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type
    jour
  • DOI
    10.1109/TMM.2014.2306655
  • Filename
    6742613
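
The abstract describes summarizing each concept by a representative distribution in the latent space and then deriving a relatedness measure that satisfies the triangle inequality, plus an abstraction level per concept. The sketch below illustrates that general idea only; it is not the paper's formulation. It assumes the CCA projection has already been done and that each concept comes with a handful of latent vectors. The diagonal-Gaussian summary, the Hellinger distance (chosen here simply because it is a true metric), and the entropy-based `abstraction_proxy` are all stand-ins introduced for illustration, as are the function names and the toy data.

```python
import math
from statistics import mean, pvariance


def summarize(latents, eps=1e-6):
    """Summarize one concept by per-dimension mean and variance (diagonal Gaussian).

    `latents` is a list of equal-length tuples, assumed to be CCA latent vectors
    of the images tagged with the concept. `eps` keeps variances strictly positive.
    """
    dims = list(zip(*latents))
    means = [mean(d) for d in dims]
    variances = [pvariance(d) + eps for d in dims]
    return means, variances


def hellinger(p, q):
    """Hellinger distance between two diagonal Gaussians (a metric with values in [0, 1])."""
    (m1, v1), (m2, v2) = p, q
    bc = 1.0  # Bhattacharyya coefficient; factorizes over dimensions for diagonal covariances
    for a, va, b, vb in zip(m1, v1, m2, v2):
        scale = math.sqrt(2.0 * math.sqrt(va * vb) / (va + vb))
        bc *= scale * math.exp(-((a - b) ** 2) / (4.0 * (va + vb)))
    return math.sqrt(max(0.0, 1.0 - bc))


def abstraction_proxy(p):
    """Differential entropy of the diagonal Gaussian; a hypothetical stand-in for
    an abstraction level (a wider spread is read as a more abstract concept)."""
    _, variances = p
    return 0.5 * sum(math.log(2.0 * math.pi * math.e * v) for v in variances)


# Toy 2-D latent vectors (made up for illustration, not data from the paper).
concepts = {
    "dog": summarize([(1.0, 0.2), (1.2, 0.1), (0.9, 0.3)]),
    "puppy": summarize([(1.1, 0.25), (1.0, 0.15), (1.2, 0.2)]),
    "car": summarize([(-2.0, 1.5), (-1.8, 1.7), (-2.2, 1.4)]),
}

d_dog_puppy = hellinger(concepts["dog"], concepts["puppy"])
d_dog_car = hellinger(concepts["dog"], concepts["car"])
d_puppy_car = hellinger(concepts["puppy"], concepts["car"])
```

With this toy data, `d_dog_puppy` comes out smaller than `d_dog_car`, which is the ordering one would hope a relatedness measure produces. Because the Hellinger distance is a metric, distances computed this way can be passed directly to clustering or nearest-neighbor routines that assume the triangle inequality.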