• DocumentCode
    2971336
  • Title
    Disambiguating Keyword Queries on RDF Databases Using "Deep" Segmentation
  • Author
    Fu, Haizhou ; Gao, Sidan ; Anyanwu, Kemafor
  • Author_Institution
    Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
  • fYear
    2010
  • fDate
    22-24 Sept. 2010
  • Firstpage
    236
  • Lastpage
    243
  • Abstract
    Keyword search on (semi)structured databases is an increasingly popular research topic, but existing techniques do not deal well with ambiguous queries. Recent approaches for RDF databases try to improve the quality of results by introducing an explicit top-k “interpretation” phase in which queries are translated into an ordered list of the “most likely intended” structured (SPARQL) queries before query execution. However, even these recent techniques address keyword query ambiguity only in a limited fashion, by identifying fine-grained semantic units, or segments, of a query. This prunes some incorrect interpretations, but the reduction in the interpretation space is not as aggressive as it could be. In this paper, we propose a “deep segmentation” technique for keyword queries issued against an RDF database. This approach prunes irrelevant interpretations more aggressively and therefore produces better-quality query interpretations even in the presence of significant query ambiguity. We present results of a comprehensive human-based evaluation built on a metric that we introduce, called degree of ambiguity (DOTA), which has not been considered by previous efforts. The experimental results show that our approach outperforms existing techniques in terms of quality even when queries are very ambiguous.
  • Keywords
    database management systems; query processing; DOTA; RDF databases; SPARQL queries; deep segmentation; degree of ambiguity; disambiguating keyword queries; keyword search; query execution; semistructured databases; Arrays; Books; Complexity theory; Databases; Joining processes; Resource description framework; Springs; Interpretation; Keyword Query; RDF; Segmentation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing (ICSC), 2010 IEEE Fourth International Conference on
  • Conference_Location
    Pittsburgh, PA
  • Print_ISBN
    978-1-4244-7912-2
  • Electronic_ISBN
    978-0-7695-4154-9
  • Type
    conf
  • DOI
    10.1109/ICSC.2010.90
  • Filename
    5629254