Author_Institution :
Dept. of Comput. Sci. & Inf. Manage., SooChow Univ., Taipei, Taiwan
Abstract :
Big data has been the most popular topic in recent years in practice, academia, and government, all of which seek to realize the value of data. Many information technologies and software tools, such as Hadoop, NoSQL databases, and cloud computing, have been proposed to deal with big data. However, these tools only help us store, manage, search, and control data; they do not extract knowledge from it. The only way to mine the nuggets in big data is to be able to analyze it. The complexity of big data, e.g., its volume and variety, renders traditional data mining algorithms ineffective. In this paper, we deal with big data by solving distributed and high-dimensional problems. We propose a novel algorithm to effectively extract knowledge from big data. According to the empirical study, the proposed method can handle big data soundly.
Keywords :
Big Data; data mining; distributed processing; genetic algorithms; Hadoop; NoSQL database; cloud computing; data mining algorithm; distributed data; distributed problem; genetic programming; government; high-dimensional data; high-dimensional problem; information technology; knowledge extraction; Algorithm design and analysis; Data models; Distributed databases; Feature extraction; distributed problems; high-dimensional problems