Title :
Form driven web source integration
Author :
Saissi, Yasser ; Zellou, Ahmed ; Idri, Ali
Author_Institution :
ENSIAS, Mohammed V-Souissi Univ., Rabat, Morocco
Abstract :
Web sources contain a huge amount of data that we need to integrate and use. Integrating a web source requires knowing its source description. In general, web sources contain structured data such as HTML forms and HTML tables. This paper proposes an approach that uses this structured information to extract a relational schema describing the web source. The key idea underlying our approach is to extract the relational data structure of the HTML forms contained in the web source. Thanks to the features of the HTML form, the extracted relational data structure describes not only the HTML form but also the associated web source. Next, we propose to query the extracted HTML forms to generate useful HTML table results. The data structure of the resulting HTML tables is then used to enhance the relational data structure of the associated HTML form. Finally, from the relational data structures extracted from all the HTML forms and HTML tables, we build the relational schema describing the associated web source.
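As a rough illustration of the idea summarized in the abstract, the sketch below maps each HTML form to one relation whose attributes are the names of its input fields, then merges in the column headers of a result table. It is not the authors' algorithm: the names `FormSchemaParser`, `extract_form_relations`, and `enhance_with_table_headers`, the rule for naming relations, and the choice to skip button-like inputs are all assumptions made for this example. It relies only on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class FormSchemaParser(HTMLParser):
    """Collects one relation per <form>: a relation name plus attribute names."""

    FIELD_TAGS = {"input", "select", "textarea"}
    # Assumption: button-like inputs carry no data and are left out of the relation.
    SKIPPED_TYPES = {"submit", "reset", "button", "image"}

    def __init__(self):
        super().__init__()
        self.relations = {}   # relation name -> ordered list of attribute names
        self._current = None  # name of the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # Hypothetical naming rule: the form's name, else its id, else a counter.
            name = (attrs.get("name") or attrs.get("id")
                    or f"form_{len(self.relations) + 1}")
            self._current = name
            self.relations[name] = []
        elif tag in self.FIELD_TAGS and self._current is not None:
            if (attrs.get("type") or "text").lower() in self.SKIPPED_TYPES:
                return
            field = attrs.get("name")
            if field and field not in self.relations[self._current]:
                self.relations[self._current].append(field)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def extract_form_relations(html):
    """Return {relation name: [attribute names]} for every form in the page."""
    parser = FormSchemaParser()
    parser.feed(html)
    parser.close()
    return parser.relations


def enhance_with_table_headers(attributes, table_headers):
    """Append result-table column headers not already present as attributes."""
    enhanced = list(attributes)
    for header in table_headers:
        if header not in enhanced:
            enhanced.append(header)
    return enhanced


page = (
    '<form name="book_search">'
    '<input type="text" name="title">'
    '<select name="category"></select>'
    '<input type="submit" value="Go">'
    '</form>'
)
relations = extract_form_relations(page)
print(relations)
print(enhance_with_table_headers(relations["book_search"], ["title", "price"]))
```

In this sketch the form contributes the attributes `title` and `category`, and the hypothetical result-table headers add `price`. A real system would also need to submit the form and parse the returned table, which is omitted here.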
Keywords :
Internet; data structures; hypermedia markup languages; Form driven Web source integration; HTML table; relational data structure extraction; relational schema; source description; structured information; Context; Data mining; Databases; Google; HTML; Mediation; Web pages; Web source; Web source integration; structured data;
Conference_Title :
Intelligent Systems: Theories and Applications (SITA-14), 2014 9th International Conference on
Conference_Location :
Rabat
Print_ISBN :
978-1-4799-3566-6
DOI :
10.1109/SITA.2014.6847288