DocumentCode
3002320
Title
Using Ontologies for Measuring Semantic Similarity in Data Warehouse Schema Matching Process
Author
Banek, M. ; Vrdoljak, B. ; Tjoa, A.M.
Author_Institution
Univ. of Zagreb, Zagreb
fYear
2007
fDate
13-15 June 2007
Firstpage
227
Lastpage
234
Abstract
The key step of data warehouse integration is the construction of mappings that link mutually compatible components of data warehouse schemas: dimensions, aggregation levels, attributes and facts. To perform the integration process in a semi-automated manner, we must define similarity functions that compare the names and substructures of those schema elements. During the last decade, many approaches to measuring semantic similarity between lexical terms have been introduced, most of them based either on the taxonomy of WordNet, a large lexical and thesaurus database of the English language, or on statistics previously gathered from a language corpus. This paper presents a novel semantic similarity technique, based on edge counting, which combines WordNet with domain ontologies written in OWL and is implemented in Java. Because ontologies are designed by domain experts, they provide a better and more trustworthy source for calculating similarity, and because their terms are more closely related than in WordNet, the resulting similarity values are higher.
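The edge-counting idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' formula or their Java implementation: the score 1 / (1 + path length), the rule of preferring the domain ontology over the general taxonomy, the function names, and the toy edge lists (which are invented and not taken from WordNet or any real OWL ontology) are all assumptions made for illustration.

```python
from collections import deque


def path_length(edges, source, target):
    """Number of edges on the shortest path between two terms, or None."""
    # Build an undirected adjacency map from the (term, term) edge list.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if source not in graph or target not in graph:
        return None
    # Breadth-first search yields the shortest path in an unweighted graph.
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def edge_similarity(edges, a, b):
    """Assumed score: 1 / (1 + shortest path length), 0.0 if unconnected."""
    d = path_length(edges, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def combined_similarity(domain_edges, general_edges, a, b):
    """Assumed combination rule: use the domain ontology when it links both
    terms, otherwise fall back to the general taxonomy."""
    if path_length(domain_edges, a, b) is not None:
        return edge_similarity(domain_edges, a, b)
    return edge_similarity(general_edges, a, b)


# Hypothetical toy data, invented for this sketch.
GENERAL = [("customer", "consumer"), ("consumer", "user"),
           ("user", "person"), ("client", "person")]
DOMAIN = [("customer", "buyer"), ("client", "buyer")]

if __name__ == "__main__":
    print(edge_similarity(GENERAL, "customer", "client"))
    print(combined_similarity(DOMAIN, GENERAL, "customer", "client"))
```

In this toy data the two terms are four edges apart in the general taxonomy but only two edges apart in the domain ontology, so the combined score is higher, which mirrors the abstract's claim that more closely related ontology terms yield a higher similarity.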
Keywords
data warehouses; knowledge representation languages; ontologies (artificial intelligence); string matching; text analysis; thesauri; English language thesaurus database; Java software; OWL; WordNet taxonomy; data warehouse integration; domain ontologies; heterogeneous data warehouse schema matching process; lexical terms; semantic similarity measure; Data warehouses; Databases; Java; Natural languages; OWL; Ontologies; Software design; Statistics; Taxonomy; Thesauri;
fLanguage
English
Publisher
IEEE
Conference_Titel
Telecommunications, 2007. ConTel 2007. 9th International Conference on
Conference_Location
Zagreb
Print_ISBN
953-184-111-X
Electronic_ISBN
953-184-111-X
Type
conf
DOI
10.1109/CONTEL.2007.381876
Filename
4267502