Author_Institution :
Coll. of Comput. Sci. & Technol., Nanjing Univ. of Aeronaut. & Astronaut., Nanjing, China
Abstract :
As an important resource and productive element, big data permeates many domains, such as e-commerce, traffic management and smart cities. The capability to aggregate information and then deeply mine and analyze its latent knowledge promises a wealth of innovative achievements. Big data mining and deep analytics have therefore become a research hotspot, attracting growing attention from academia, industry and government. However, because of the "3Vs" (volume, velocity and variety) characteristics of big data, there is no single tool or one-size-fits-all solution for big data processing. This paper reports our experience in building a cloud-based big data mining and analysis services platform that integrates R to provide rich statistical and analytic functions. The architecture of the services platform is discussed in detail; it comprises four layers: the infrastructure layer, the virtualization layer, the dataset processing layer and the services layer. Following this architecture, the implementation of a K-Means algorithm service is presented as an example. Finally, we draw conclusions and explore future research directions.
Keywords :
Big Data; cloud computing; data analysis; data mining; 3Vs characters; Big Data processing; analytic functions; cloud-based Big Data mining; data statistical function; dataset processing layer; deep analytics; e-commerce; infrastructure layer; k-means algorithm; services layer; services platform integrating R analysis; smart city; traffic management; virtualization layer; volume, velocity and variety characters; Cloud computing; Computer architecture; Data mining; Educational institutions; Programming; Virtual machining; K-Mean clustering; RHadoop; big data; cloud computing; data mining and analyzing;