• DocumentCode
    240235
  • Title
    Fundamentals of a graph transformation based web data processing system
  • Author
    Imre, Gabor ; Mezei, Gergely
  • Author_Institution
    Dept. of Autom. & Appl. Inf., Budapest Univ. of Technol. & Econ., Budapest, Hungary
  • fYear
    2014
  • fDate
    5-7 Nov. 2014
  • Firstpage
    537
  • Lastpage
    542
  • Abstract
    The Internet is the most complex and complete source of information in the history of mankind. The innumerable webpages and myriad data providers form a complex, highly heterogeneous, continuously evolving system. This environment demands continuous research in information retrieval. Here we present our contribution: a lightweight, semi-formal approach to web exploration and web data analysis. Our approach focuses on analyzing heterogeneous, semi-structured or barely structured web data. Complex queries can be performed over multiple heterogeneous data sources; graph transformations can be used to adjust the queries and to analyze the results. The approach integrates a priori human knowledge, an effective web querying method, the formality of graph transformations and, optionally, qualitative and quantitative analysis algorithms. We present an illustrative case study performing multi-domain search and data transformations: Stack Overflow users with high contribution in a topic are searched for, and an attempt is then made to match these users to their LinkedIn profiles.
  • Keywords
    Internet; data analysis; graph grammars; query processing; text analysis; LinkedIn profiles; Web data analysis; Web pages; Web querying method; graph transformation based Web data processing system; heterogeneous data sources; information retrieval; information source; lightweight semiformal approach; multidomain search; qualitative analysis algorithms; quantitative analysis algorithms; stack overflow users; Algorithm design and analysis; Arrays; Data processing; LinkedIn; Mashups; Navigation; Semantics; Graph Transformation; Integration; Mashup; Model Driven Development; Pattern Matching; Text Analysis; Web Data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cognitive Infocommunications (CogInfoCom), 2014 5th IEEE Conference on
  • Conference_Location
    Vietri sul Mare
  • Type
    conf
  • DOI
    10.1109/CogInfoCom.2014.7020515
  • Filename
    7020515