• DocumentCode
    2349943
  • Title

    Learning Structure and Schemas from Heterogeneous Domains in Networked Systems: A Survey

  • Author

    Biba, Marenglen ; Xhafa, Fatos

  • Author_Institution
    Dept. of Comput. Sci., Univ. of New York Tirana, Tirana, Albania
  • fYear
    2010
  • fDate
    24-26 Nov. 2010
  • Firstpage
    222
  • Lastpage
    229
  • Abstract
    The rapidly growing number of available digital documents in various formats, and the possibility of accessing them through internet-based technologies in distributed environments, have made it necessary to develop solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections makes it impossible to organize such documents manually. Additionally, most of the documents exist in an unstructured form and do not follow any schema. Therefore, research efforts in this direction are being dedicated to automatically inferring structure and schemas. This is essential in order to better organize huge collections as well as to retrieve documents effectively and efficiently in heterogeneous domains in networked systems. This paper presents a survey of the state-of-the-art methods for inferring structure and schemas from documents in networked environments. The survey is organized around the most important application domains, namely bioinformatics, sensor networks, social networks, P2P systems, automation and control, transportation, and privacy preservation, for which we analyze recent developments in dealing with unstructured data.
  • Keywords
    Internet; data mining; digital libraries; document handling; learning (artificial intelligence); heterogeneous domains; internet-based technologies; machine learning; networked systems; repositories; structure documents; distributed systems; heterogeneous data; structure learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Networking and Collaborative Systems (INCOS), 2010 2nd International Conference on
  • Conference_Location
    Thessaloniki
  • Print_ISBN
    978-1-4244-8828-5
  • Electronic_ISBN
    978-1-4244-4278-2
  • Type

    conf

  • DOI
    10.1109/INCOS.2010.63
  • Filename
    5702099