DocumentCode
2349943
Title
Learning Structure and Schemas from Heterogeneous Domains in Networked Systems: A Survey
Author
Biba, Marenglen ; Xhafa, Fatos
Author_Institution
Dept. of Comput. Sci., Univ. of New York Tirana, Tirana, Albania
fYear
2010
fDate
24-26 Nov. 2010
Firstpage
222
Lastpage
229
Abstract
The rapidly growing number of digital documents in various formats, together with the ability to access them through internet-based technologies in distributed environments, has made it necessary to develop solid methods for properly organizing and structuring documents in large digital libraries and repositories. Specifically, the extremely large size of document collections makes it impossible to organize such documents manually. Additionally, most of the documents exist in an unstructured form and do not follow any schema. Therefore, research efforts in this direction are being dedicated to automatically inferring structure and schemas. This is essential in order to better organize huge collections as well as to retrieve documents effectively and efficiently in heterogeneous domains in networked systems. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely bio-informatics, sensor networks, social networks, P2P systems, automation and control, transportation, and privacy preservation, for which we analyze recent developments in dealing with unstructured data.
Keywords
Internet; data mining; digital libraries; document handling; learning (artificial intelligence); heterogeneous domains; internet-based technologies; machine learning; networked systems; repositories; structure documents; distributed systems; heterogeneous data; structure learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Networking and Collaborative Systems (INCOS), 2010 2nd International Conference on
Conference_Location
Thessaloniki
Print_ISBN
978-1-4244-8828-5
Electronic_ISBN
978-1-4244-4278-2
Type
conf
DOI
10.1109/INCOS.2010.63
Filename
5702099