DocumentCode :
1762474
Title :
Adaptive Database Schema Design for Multi-Tenant Data Management
Author :
Jiacai Ni ; Guoliang Li ; Lijun Wang ; Jianhua Feng ; Jun Zhang ; Lei Li
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Volume :
26
Issue :
9
fYear :
2014
fDate :
Sept. 2014
Firstpage :
2079
Lastpage :
2093
Abstract :
Multi-tenant data management is a major application of Software as a Service (SaaS). For example, many companies want to outsource their data to a third party that hosts a multi-tenant database system to provide data management services. The multi-tenant database system needs to have high performance, low space requirement, and excellent scalability. One big challenge is devising a high-quality database schema. Independent Tables Shared Instances (ITSI) and Shared Tables Shared Instances (STSI) are two state-of-the-art approaches to designing the schema. However, they suffer from some limitations. ITSI has poor scalability since it needs to maintain large numbers of tables. STSI achieves good scalability at the expense of poor performance and high space overhead. Thus, an effective schema design method that addresses these problems is needed. In this paper, we propose an adaptive database schema design method for multi-tenant applications. We trade off ITSI and STSI and find a balance between them to achieve good scalability and high performance with low space requirement. To this end, we identify the important attributes and use them to generate an appropriate number of base tables. For the remaining attributes, we construct supplementary tables. We discuss how to use the kernel matrix to determine the number of base tables, apply graph-partitioning algorithms to construct the base tables, and evaluate the importance of attributes using the well-known PageRank algorithm. We propose a cost-based model to adaptively generate the base tables and supplementary tables. Our method has the following advantages. First, our method achieves high scalability. Second, our method achieves high performance and can trade off performance against space requirement. Third, our method can be easily applied to existing databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any schemas and query workloads, including both OLAP and OLTP applications. Experimental results on both real and synthetic datasets show that our method achieves high performance and good scalability with low space requirement and outperforms state-of-the-art methods.
Keywords :
cloud computing; database management systems; matrix algebra; outsourcing; query processing; ITSI; MySQL; OLAP application; OLTP application; PageRank algorithm; STSI; SaaS; adaptive base table generation; adaptive database schema design; cost-based model; data outsourcing; graph-partitioning algorithms; high-performance requirement; high-quality database schema; independent tables shared instances; kernel matrix; low-space requirement; multitenant data management; multitenant database system; query workloads; real datasets; scalability issue; shared tables shared instances; software as a service; space overhead; supplementary tables; synthetic datasets; Design methodology; Indexes; Scalability; Servers; Software as a service; Database Applications; Database Management; Information Technology and Systems; SaaS; adaptive schema design; multi-tenant;
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2013.94
Filename :
6529069