Title :
Supporting statistics in extensible databases: a case study
Author :
Segev, Arie ; Chatterjee, Abhirup
Abstract :
This paper presents a framework for supporting regression in extensible databases. The work is motivated by an actual case study that required a database to support probabilistic record matching using regression. We discuss how the regression function can be implemented in a database using rules and functions. We present indexing and caching ideas for efficient data processing in retrieval-intensive applications. We also describe how the regression coefficients can be incrementally updated by materializing some of the intermediate statistical results. Finally, we illustrate how sampling techniques can be used to generate the data sets for regression and highlight some of the associated performance issues.
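The abstract does not say which intermediate results the authors materialize. As a rough illustration of the general idea only, the sketch below assumes simple linear regression with one predictor and keeps five running sums, so that adding a record updates the coefficients without rescanning earlier data. The class name `RegressionSums` and its methods are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RegressionSums:
    """Hypothetical materialized sums for the model y = intercept + slope * x."""
    n: int = 0
    sum_x: float = 0.0
    sum_y: float = 0.0
    sum_xx: float = 0.0
    sum_xy: float = 0.0

    def add(self, x: float, y: float) -> None:
        # Fold one new record into the stored sums; old records are not reread.
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xx += x * x
        self.sum_xy += x * y

    def coefficients(self) -> Tuple[float, float]:
        # Ordinary least squares computed from the stored sums alone.
        denom = self.n * self.sum_xx - self.sum_x ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two records with distinct x values")
        slope = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        intercept = (self.sum_y - slope * self.sum_x) / self.n
        return intercept, slope


sums = RegressionSums()
for x, y in [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]:
    sums.add(x, y)
intercept, slope = sums.coefficients()

# A later insert touches only the five stored sums.
sums.add(4.0, 9.0)
intercept_after, slope_after = sums.coefficients()
```

In a database setting, the sums would presumably live in a small stored relation updated by a rule on insert. That mapping is an assumption made here for illustration, not a description of the authors' design.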
Keywords :
buffer storage; database management systems; indexing; performance evaluation; statistical analysis; caching; case study; data processing; data set generation; extensible databases; functions; probabilistic record matching; regression; regression coefficients; regression function; retrieval-intensive applications; rules; sampling techniques; statistical results; statistics; Computer aided software engineering; Databases; Information retrieval; Laboratories; Packaging; Sampling methods
Conference_Title :
Scientific and Statistical Database Management, 1994. Proceedings., Seventh International Working Conference on
Conference_Location :
Charlottesville, VA
Print_ISBN :
0-8186-6610-2
DOI :
10.1109/SSDM.1994.336962