DocumentCode :
254820
Title :
Organization of Knowledge Extraction from Big Data Systems
Author :
Mani, G.; Bari, Nazhad; Liao, Duoduo; Berkovich, Simon
Author_Institution :
Dept. of Comput. Sci., George Washington Univ., Washington, DC, USA
fYear :
2014
fDate :
4-6 Aug. 2014
Firstpage :
63
Lastpage :
69
Abstract :
Although present-day technologies offer a number of solutions for handling large amounts of data, the growing accumulation of data (termed Big Data) from Internet sources such as emails, videos, images, and text, as well as digital data from medicine, genetics, sensors, and wireless devices, demands efficient organizational and engineering designs. Many forms of digital data, such as maps and climate informatics, and geospatial attributes such as global positioning coordinates, location information, and directions, are represented by text, images, or interactive graphics-videos. A single source may produce various types of data (e.g., a geospatial data source may produce both image- and text-type data). This vast and rich data requires a generic processing mechanism that can adapt to various data types and classify them accordingly. In this paper, we propose a technique to optimize information processing for on-the-fly clusterization of disorganized and unclassified data from a vast number of sources. The technique is based on fuzzy logic, using fault-tolerant indexing with error-correction Golay coding. We present an information processing model and an optimized technique for clustering continuous and complex data streams. We show that this mechanism can efficiently retrieve sensible information from the underlying data clusters. The main objective of this paper is to introduce a tool for this demanding Big Data processing task: on-the-fly clustering of amorphous data items in data stream mode. Finally, we draw parallels between computational models of Big Data processing and information processing in the human brain, where the human brain can be considered a Big Data machine.
Keywords :
Big Data; Golay codes; Internet; error correction codes; fault tolerance; fuzzy logic; indexing; knowledge acquisition; pattern clustering; Big Data systems; Internet; amorphous data item on-the-fly clustering; complex data stream clustering; continuous data stream clustering; data clusters; disorganized data on-the-fly clusterization; engineering design; error-correction Golay coding; fault-tolerant indexing; fuzzy logic; generic processing mechanism; human brain information processing; information processing optimization; knowledge extraction organization; organizational design; unclassified data on-the-fly clusterization; Big data; Brain modeling; Computational modeling; Data models; Drugs; Information processing; Organizations; Big Data; Error-correcting codes; Golay coding; autism; brain information-processing model; clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computing for Geospatial Research and Application (COM.Geo), 2014 Fifth International Conference on
Conference_Location :
Washington, DC
Type :
conf
DOI :
10.1109/COM.Geo.2014.6
Filename :
6910122