DocumentCode :
2080430
Title :
Intensional associations in dataspaces
Author :
Salles, Marcos Antonio Vaz ; Dittrich, Jens ; Blunschi, Lukas
Author_Institution :
Cornell Univ., Ithaca, NY, USA
fYear :
2010
fDate :
1-6 March 2010
Firstpage :
984
Lastpage :
987
Abstract :
Dataspace applications necessitate the creation of associations among data items over time. For example, once information about people is extracted from sources on the Web, associations among them may emerge as a consequence of different criteria, such as their city of origin or their elected hobbies. In this paper, we advocate a declarative approach to specifying these associations. We propose that each set of associations be defined by an association trail. An association trail is a query-based definition of how items are connected by intensional (i.e., virtual) association edges to other items in the dataspace. We study the problem of processing neighborhood queries over such intensional association graphs. The naive approach to neighborhood query processing over intensional graphs is to materialize the whole graph and then apply previous work on dataspace graph indexing to answer queries. We present in this paper a novel indexing technique, the grouping-compressed index (GCI), that has better worst-case indexing cost than the naive approach. In our experiments, GCI is shown to provide an order of magnitude gain in indexing cost over the naive approach, while remaining competitive in query processing time.
Keywords :
data mining; database indexing; database management systems; query processing; Web; association trail; dataspace graph indexing; dataspaces applications; grouping compressed index; indexing technique; information extraction; intensional association edges; intensional association graphs; intensional associations; magnitude gain; naive approach; neighborhood query processing; query processing time; worst case indexing cost; Cities and towns; Content management; Costs; Data mining; Indexing; Information management; Query processing; Universal Serial Bus
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering (ICDE), 2010 IEEE 26th International Conference on
Conference_Location :
Long Beach, CA
Print_ISBN :
978-1-4244-5445-7
Electronic_ISBN :
978-1-4244-5444-0
Type :
conf
DOI :
10.1109/ICDE.2010.5447833
Filename :
5447833