Title :
XML-Based Knowledge Discovery for the Linguistic Atlas of Sicily (ALS) Project
Author :
Russo, Giuseppe ; Gentile, Antonio ; Pirrone, Roberto ; Cannella, Vincenzo
Author_Institution :
Univ. degli Studi di Palermo, Palermo
Abstract :
The identification of new, useful patterns in data is a core process for intelligent systems. Information overflow is directly related to this problem. In this work we propose a knowledge discovery methodology to retrieve useful and novel information from raw data stored in a DBMS. We use ALSDB, a database built to provide access to structured information obtained from the questionnaires produced in the Linguistic Atlas of Sicily (ALS) project. The ALS project is a decade-long joint effort led by researchers at the Dipartimento di Scienze Filologiche e Linguistiche of the University of Palermo that aims to track and study the geo-linguistic and lexicographic processes concerning the function and usage of the Sicilian dialect. The main goal of the work described in this paper is to develop an information retrieval methodology that incorporates the directions of linguistic investigation embedded in the ALS questionnaire into a querying tool that abstracts away from the intricacies of SQL or XML query constructs. We do this by setting up a methodology and data retrieval tool that is scalable and general enough, first, to allow evaluation of linguists' hypotheses about regional language and dialect evolution in space and time and, second, to help discover new directions of investigation. This work presents the process of knowledge discovery. Starting from the conceptualization of a few basic ideas, concepts are extracted from the DBMS through an XML-based mapping and used as building blocks for further investigations. The interaction with users is intuitive, and the results are incrementally and automatically proposed to the researchers, who may decide either to keep them as new knowledge for further use or to discard them.
Keywords :
XML; data mining; database management systems; linguistics; query processing; ALSDB database; DBMS; Linguistic Atlas of Sicily project; Sicilian dialect; XML-based knowledge discovery; XML-based mapping; data pattern identification; data retrieval; geo-linguistic process; information overflow; information retrieval; intelligent systems; lexicographic process; querying tool; structured information; Buildings; Competitive intelligence; Data mining; Information retrieval; Intelligent systems; Iterative methods; Ontologies; Relational databases; Software systems; XML; ALS project; Knowledge Discovery; XML mapping; automatic query; relational database;
Conference_Title :
Complex, Intelligent and Software Intensive Systems, 2009. CISIS '09. International Conference on
Conference_Location :
Fukuoka
Print_ISBN :
978-1-4244-3569-2
Electronic_ISBN :
978-0-7695-3575-3
DOI :
10.1109/CISIS.2009.151