DocumentCode
2080430
Title
Intensional associations in dataspaces
Author
Salles, Marcos Antonio Vaz ; Dittrich, Jens ; Blunschi, Lukas
Author_Institution
Cornell Univ., Ithaca, NY, USA
fYear
2010
fDate
1-6 March 2010
Firstpage
984
Lastpage
987
Abstract
Dataspace applications necessitate the creation of associations among data items over time. For example, once information about people is extracted from sources on the Web, associations among them may emerge as a consequence of different criteria, such as their city of origin or their elected hobbies. In this paper, we advocate a declarative approach to specifying these associations. We propose that each set of associations be defined by an association trail. An association trail is a query-based definition of how items are connected by intensional (i.e., virtual) association edges to other items in the dataspace. We study the problem of processing neighborhood queries over such intensional association graphs. The naive approach to neighborhood query processing over intensional graphs is to materialize the whole graph and then apply previous work on dataspace graph indexing to answer queries. We present in this paper a novel indexing technique, the grouping-compressed index (GCI), that has better worst-case indexing cost than the naive approach. In our experiments, GCI is shown to provide an order of magnitude gain in indexing cost over the naive approach, while remaining competitive in query processing time.
Keywords
data mining; database indexing; database management systems; query processing; Web; association trail; dataspace graph indexing; dataspaces applications; grouping compressed index; indexing technique; information extraction; intensional association edges; intensional association graphs; intensional associations; magnitude gain; naive approach; neighborhood query processing; query processing time; worst case indexing cost; Cities and towns; Content management; Costs; Data mining; Indexing; Information management; Query processing; Universal Serial Bus;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Engineering (ICDE), 2010 IEEE 26th International Conference on
Conference_Location
Long Beach, CA
Print_ISBN
978-1-4244-5445-7
Electronic_ISBN
978-1-4244-5444-0
Type
conf
DOI
10.1109/ICDE.2010.5447833
Filename
5447833