Pre-clustering algorithm for anomaly detection and clustering that uses variable size buckets

Author

Sharma, Manish ; Toshniwal, Durga

Author_Institution

Electron. & Comput. Eng., Indian Inst. of Technol. Roorkee, Roorkee, India

fYear

2012

fDate

15-17 March 2012

Firstpage

515

Lastpage

519

Abstract

Clustering groups data items according to their similarity. This paper introduces a k-means-based algorithm for clustering data streams and detecting outliers. Outliers are detected using both distance and the time at which they arrive in the cluster. The paper also addresses the selection of the k centers and the use of variable-size buckets, which allow space to be used effectively during clustering. Most traditional algorithms sacrifice clustering quality for better efficiency. The paper indicates that, with a small increase in running time, data can be clustered efficiently without much loss of quality.
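
The abstract does not spell out the procedure, so the following Python sketch is only one possible reading of it, not the authors' algorithm. It assumes that the stream arrives as buckets of varying length, that a point farther than a threshold from every center is held as an outlier candidate, and that a candidate is declared an outlier once it has stayed unabsorbed for a given number of buckets. The names `cluster_stream`, `dist_threshold` and `max_age` are hypothetical, as is the choice to give each initial center a weight of one.

```python
import math


def nearest(point, centers):
    """Return (index, distance) of the center closest to point."""
    dists = [math.dist(point, c) for c in centers]
    idx = min(range(len(centers)), key=dists.__getitem__)
    return idx, dists[idx]


def cluster_stream(buckets, centers, dist_threshold, max_age):
    """Cluster a stream given as buckets (lists of points) of any size.

    Assumed rule, not taken from the paper: a point farther than
    dist_threshold from every center is kept as a candidate; if it is
    still unabsorbed max_age buckets after it arrived, it is reported
    as an outlier.
    """
    centers = [list(c) for c in centers]
    counts = [1] * len(centers)   # assumption: each seed center has weight 1
    pending = []                  # (arrival bucket number, point)
    outliers = []

    for bucket_no, bucket in enumerate(buckets):
        # Re-examine older candidates together with the new arrivals.
        candidates = pending + [(bucket_no, p) for p in bucket]
        pending = []
        for arrived, p in candidates:
            idx, d = nearest(p, centers)
            if d <= dist_threshold:
                # Absorb the point with an incremental mean update.
                counts[idx] += 1
                centers[idx] = [
                    c + (x - c) / counts[idx]
                    for c, x in zip(centers[idx], p)
                ]
            elif bucket_no - arrived >= max_age:
                outliers.append(p)            # far away for too long
            else:
                pending.append((arrived, p))  # give it more time

    return centers, outliers, pending


if __name__ == "__main__":
    buckets = [
        [(0.5, 0.5), (50, 50)],          # small bucket
        [(9.5, 10), (0, 1), (10, 9)],    # larger bucket
        [(1, 0)],
    ]
    centers, outliers, pending = cluster_stream(
        buckets, centers=[(0, 0), (10, 10)], dist_threshold=3, max_age=1
    )
    print(centers, outliers, pending)
```

In this sketch only the centers, their counts and the pending candidates are kept between buckets, so memory use depends on the largest bucket rather than on the length of the stream. How bucket sizes and the k initial centers should be chosen is left to the caller here, since the abstract does not say.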

Keywords

pattern clustering; anomaly detection; data grouping; data quality; data streams; outlier detection; preclustering algorithm; variable size buckets; Algorithm design and analysis; Clustering algorithms; Data mining; Heuristic algorithms; Information technology; Intrusion detection; Iris; boolean data; categorical data; clustering; k means

fLanguage

English

Publisher

IEEE

Conference_Titel

Recent Advances in Information Technology (RAIT), 2012 1st International Conference on

Conference_Location

Dhanbad

Print_ISBN

978-1-4577-0694-3

Type

conf

DOI

10.1109/RAIT.2012.6194613

Filename

6194613