• Title of article

    Exploring dictionary-based semantic relatedness in labeled tree data

  • Author/Authors

    Andrea Tagarelli، نويسنده ,

  • Issue Information
    روزنامه با شماره پیاپی سال 2013
  • Pages
    25
  • From page
    244
  • To page
    268
  • Abstract
    The increase in the volume and heterogeneity of semistructured data based application scenarios has demanded for next-generation methods that are able to effectively couple syntactic with semantic information in data management and mining tasks. The focus of this paper is on the development of methods for determining semantic relatedness in tree-shaped semistructured data and on the assessment of the impact of these methods on structural sense ranking in such data. By exploiting key features of a lexical knowledge base like WordNet, namely ontological relations and concept definitions, we propose a twofold approach that takes into account the particular form of labeled tree data as a conceptual hierarchical representation of real-world objects. We infer indirect relationships between tag concepts and exploit an interleaved search through different concept hierarchies in order to extend semantic relatedness measures originally conceived for plain-text data to deal with labeled tree data instances. We also develop a structural sense ranking framework which employs a context graph built on the tag concepts and the structural relations among tags in the tree data. Experimental evidence on a large real-world collection of Wikipedia articles has shown that the proposed methods can effectively detect and maximize semantic relatedness in tree-structured data, and can be profitably used to perform structural sense ranking.
  • Keywords
    Semantic relatedness measures , Tree-shaped semistructured data and XML , Structural sense ranking , PageRank , wordnet
  • Journal title
    Information Sciences
  • Serial Year
    2013
  • Journal title
    Information Sciences
  • Record number

    1215296