DocumentCode
3307777
Title
SAKU: A distributed system for data analysis in large-scale dataset based on cloud computing
Author
Lei Qin ; Bin Wu ; Qing Ke ; Yuxiao Dong
Author_Institution
Sch. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
Volume
2
fYear
2011
fDate
26-28 July 2011
Firstpage
1257
Lastpage
1261
Abstract
Data analysis is widely used in enterprises for its high efficiency and accuracy, especially in the telecommunication industry, for tasks such as User Behavior Analysis and Customer Churn Prediction. However, with the exponential growth of data, traditional data analysis tools cannot handle such large-scale datasets. Furthermore, as business becomes more complicated, there is a growing need to integrate different data analysis tools. In addition, traditional analysis tools lack visualization, which makes their results hard to understand. We propose a distributed system named SAKU that addresses these problems. In this paper, we implement several algorithms using the MapReduce framework in order to process large-scale data, and we discuss every part of the system. We also present a new report framework based on cloud computing for the visualization of large-scale data. Most importantly, we apply the system to a scenario that meets real-world requirements, using a large volume of data obtained from telecom operators, which demonstrates the high efficiency and scalability of the system.
Keywords
cloud computing; data analysis; data visualisation; distributed databases; very large databases; SAKU; business; data analysis tools; data visualization; distributed system; large-scale dataset; mapreduce; telecom operators; telecommunication industry; Algorithm design and analysis; Clustering algorithms; Data mining; Telecommunications; report
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6019711
Filename
6019711