Title :
A machine learning adaptive approach to remove impurities over Bigdata
Author_Institution :
Comput. Sci. Eng., Nat. Inst. of Technol., Jalandhar, India
Abstract :
Big data is a vast store of information collected from many locations and sources. A big data store is defined as a centralized repository with a standard structural specification, but information drawn from diverse sources does not always fit this structure. Such information suffers from a number of associated impurities, including incompleteness, duplicate information, and a lack of association between dataset attributes. Representing this information in an organized, structured form requires an algorithmic approach that can identify these impurities and accept the validated data. In the present work, a two-stage model is defined under a machine learning approach to transform unstructured data into structured form. In the first stage, a fuzzy-based model analyzes the user data through impurity type analysis and association analysis; fuzzy rules are applied to identify the degree of impurity and the degree of associativity. Once the analysis is complete, the final stage transforms the unstructured data into structured data. An ontology-driven approach defines this mapping, which is performed over the domain constructs and the data constructs. The work is implemented in a Java environment. The results obtained from the system show reliable and robust information mapping, which enables effective information tracking over the dataset.
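The abstract does not give the fuzzy rules, membership functions, or ontology used, so the following Java sketch is only an illustration of the two-stage idea under stated assumptions. The class name `ImpuritySketch`, the triangular membership functions and their breakpoints, the incompleteness score, and the flat synonym table standing in for an ontology are all hypothetical choices, not the authors' method. For simplicity the sketch resolves attribute names first so that the impurity score can be computed against the target schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names, thresholds, and rules are assumptions.
class ImpuritySketch {

    // Triangular fuzzy membership: 0 outside (a, c), rising to 1 at the peak b.
    public static double triangular(double x, double a, double b, double c) {
        if (x <= a || x >= c) return 0.0;
        return x <= b ? (x - a) / (b - a) : (c - x) / (c - b);
    }

    // Crisp incompleteness score: fraction of schema attributes that are
    // missing or blank in the record (0.0 = complete, 1.0 = empty).
    public static double incompleteness(Map<String, String> record, List<String> schema) {
        if (schema.isEmpty()) return 0.0;
        int missing = 0;
        for (String attr : schema) {
            String value = record.get(attr);
            if (value == null || value.trim().isEmpty()) missing++;
        }
        return (double) missing / schema.size();
    }

    // Fuzzy labelling: return the label whose (assumed) membership
    // function gives the highest degree for the score.
    public static String impurityLevel(double score) {
        double low = triangular(score, -0.5, 0.0, 0.5);
        double medium = triangular(score, 0.0, 0.5, 1.0);
        double high = triangular(score, 0.5, 1.0, 1.5);
        if (low >= medium && low >= high) return "LOW";
        return medium >= high ? "MEDIUM" : "HIGH";
    }

    // Mapping step: rename raw field names to canonical schema attributes
    // using a synonym table (a stand-in for an ontology lookup).
    // Fields with no known synonym are dropped; the first mapping wins.
    public static Map<String, String> mapToSchema(Map<String, String> raw,
                                                  Map<String, String> synonyms) {
        Map<String, String> structured = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            String canonical = synonyms.get(e.getKey().trim().toLowerCase());
            if (canonical != null) structured.putIfAbsent(canonical, e.getValue());
        }
        return structured;
    }

    public static void main(String[] args) {
        List<String> schema = List.of("name", "email", "city");
        Map<String, String> synonyms = Map.of(
                "full name", "name", "e-mail", "email", "town", "city");

        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("Full Name", "A. Kumar");
        raw.put("e-mail", "");          // blank value counts as missing
        raw.put("notes", "unmapped");   // no synonym, so it is dropped

        Map<String, String> record = mapToSchema(raw, synonyms);
        double score = incompleteness(record, schema);
        System.out.println(record + " impurity=" + impurityLevel(score));
    }
}
```

A real system along the lines the abstract describes would also need duplicate detection and an association measure between attributes; both are omitted here because the abstract does not specify them.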
Keywords :
Big Data; Java; fuzzy set theory; learning (artificial intelligence); ontologies (artificial intelligence); storage management; Java environment; algorithmic approach; association analysis; associativity degree; centralized repository; dataset attributes; duplicate information; fuzzy based model; fuzzy rule; impurities identification; impurities removal; impurity degree; impurity type analysis; incompleteness impurities; information mapping; information storage; information tracking; machine learning adaptive approach; ontology driven work; standard structural specification; structured form; unstructured data transformation; user data; Analytical models; File systems; Impurities; Machine learning algorithms; Reliability; Servers; Bigdata; Fuzzy Effective; machine learning; structured analysis
Conference_Title :
Electronics, Communication and Computational Engineering (ICECCE), 2014 International Conference on
DOI :
10.1109/ICECCE.2014.7086616