Title :
A novel learning method to classify data streams in the internet of things
Author :
Khan, Muhammad Asad ; Khan, Ajmal ; Khan, M.N. ; Anwar, Sohel
Author_Institution :
Dept. of Inf. Technol., Univ. of Haripur, Haripur, Pakistan
Abstract :
Data streams are high-volume, multi-dimensional, unlabeled data generated in settings such as stock markets, astronomical observation, web logs, click streams, and flood, fire, and crop monitoring. Knowledge discovery in data streams is a valuable task for research, business, and the community. The fundamental step of knowledge discovery in a data stream is classifying the stream into target classes. In this paper we propose a classification mechanism for data streams. Conventional classification algorithms are of little use on data streams because of the streams' complex nature, unbounded memory requirements, and the concept drift problem. The proposed method takes a novel approach to data stream classification by applying unsupervised classification techniques such as clustering, followed by a supervised classifier such as the Support Vector Machine (SVM). The high-volume data is sampled and reduced with Simple Aggregation and Approximation (SAX). The density-based clustering algorithm DBSCAN is then applied to the data stream to reveal the number of classes present and subsequently label the data. SVM, a well-known and proven supervised classification algorithm, is applied to classify the labeled data. We tested the proposed method on the Intel Lab Data set, a data set of four environmental variables (temperature, voltage, humidity, light) collected through 54 Mica2Dot sensors over 36 days at a per-second rate. We sampled the data stream by day, and a window of a certain size n is used to train the SVM classifier. The algorithm is evaluated on different test sizes, and an average accuracy of 80% is obtained.
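The reduce-then-cluster stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the readings are made-up values rather than Intel Lab data, the names `paa` and `dbscan` and the parameters `eps` and `min_pts` are chosen here for illustration, and only a window-averaging step is shown as a stand-in for the SAX reduction. The resulting cluster labels are what a supervised classifier such as an SVM would then be trained on; that stage is omitted.

```python
import math

def paa(series, n_segments):
    """Reduce a window to n_segments values by averaging consecutive chunks.
    Assumes len(series) >= n_segments so that no chunk is empty."""
    n = len(series)
    out = []
    for i in range(n_segments):
        chunk = series[i * n // n_segments:(i + 1) * n // n_segments]
        out.append(sum(chunk) / len(chunk))
    return out

def dbscan(points, eps, min_pts):
    """Density-based clustering: returns a cluster id per point, -1 for noise."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(points)

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:   # j is also a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical single-variable stream alternating between two regimes.
stream = [20.0, 20.2, 19.9, 20.1, 30.0, 30.1, 29.8, 30.2,
          20.1, 19.8, 20.0, 20.3, 30.3, 29.9, 30.0, 30.1]
window = 4
windows = [stream[k:k + window] for k in range(0, len(stream), window)]
features = [paa(w, 2) for w in windows]          # each window -> 2 averages
labels = dbscan(features, eps=1.0, min_pts=2)    # one label per window
print(labels)
```

The number of distinct non-negative labels plays the role of the "number of classes present" in the abstract; in practice `eps` and `min_pts` would need tuning to the scale of the reduced data.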
Keywords :
Internet of Things; data mining; pattern classification; support vector machines; unsupervised learning; DB Scan; Intel Lab Data set; Internet of things; Mica2Dot sensors; SAX density based clustering algorithm; SVM classifier; data stream classification; drifting problem; environmental variables; humidity; knowledge discovery; learning method; light; memory requirements; simple aggregation and approximation density based clustering algorithm; supervised classification algorithm; supervised classifier; support vector machine; temperature; unsupervised classification techniques; voltage; Hardware; Monitoring; Support vector machines; Density based Clustering; Machine Learning; Supervised Learning; Unsupervised learning;
Conference_Title :
Software Engineering Conference (NSEC), 2014 National
Conference_Location :
Rawalpindi
Print_ISBN :
978-1-4799-6161-0
DOI :
10.1109/NSEC.2014.6998242