• DocumentCode
    2496734
  • Title
    Clustering remote RDF data using SPARQL update queries
  • Author
    Letao Qi ; Lin, H.T. ; Honavar, V.
  • Author_Institution
    Dept. of Comput. Sci., Iowa State Univ., Ames, IA, USA
  • fYear
    2013
  • fDate
    8-12 April 2013
  • Firstpage
    236
  • Lastpage
    242
  • Abstract
    The emergence of large and distributed RDF data in the Linked Open Data cloud calls for approaches to extract useful knowledge using machine learning techniques such as clustering. However, the massive size and remote nature of RDF data hinder traditional approaches that gather the datasets at a centralized location for analysis. In this work, we show how to implement two representative clustering algorithms using update queries against the SPARQL endpoint of the RDF store. We compare the time complexity and the communication complexity of our algorithms with those of approaches that require direct centralized access to the data and hence have to retrieve the entire RDF dataset from the remote location. We conduct experiments on a real social network dataset and report our preliminary findings.
  • Keywords
    SQL; cloud computing; learning (artificial intelligence); pattern clustering; SPARQL update queries; clustering algorithm; communication complexity; linked open data cloud; machine learning technique; remote RDF data clustering; time complexity; Algorithm design and analysis; Clustering algorithms; Communities; Complexity theory; Prototypes; Resource description framework; Social network services
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on
  • Conference_Location
    Brisbane, QLD
  • Print_ISBN
    978-1-4673-5303-8
  • Electronic_ISBN
    978-1-4673-5302-1
  • Type
    conf
  • DOI
    10.1109/ICDEW.2013.6547456
  • Filename
    6547456