DocumentCode
166016
Title
Projected clustering with subset selection
Author
Babu, Anoop S. ; Kaimal, M.R.
Author_Institution
Dept. of Comput. Sci. & Eng., Amrita Vishwa Vidhyapeetham, Kollam, India
fYear
2014
fDate
24-27 Sept. 2014
Firstpage
1452
Lastpage
1457
Abstract
Clustering high-dimensional data has always been a major challenge because of the inherent sparsity of the data points. Our model uses attribute selection and handles the sparse structure of the data effectively. The subset selection is done by two different methods. In the first method, we use LASSO (Least Absolute Shrinkage and Selection Operator) to select the subset of most informative attributes that preserve the cluster structure. Though there are other methods for attribute selection, LASSO has the distinctive property that it selects the most correlated set of attributes of the data. In the second method, we select the subset of linearly independent attributes using QR factorization. The model also identifies the dominant attributes of each cluster, which retain their predictive power as well. The use of LASSO also assures the quality of the projected clusters formed.
Keywords
data structures; matrix decomposition; pattern clustering; LASSO; QR factorization; attribute selection; high dimensional data; inherent data-point sparsity; least absolute shrinkage and selection operator; predictive power; projected clustering; sparse data structure; subset selection; Indexes; attribute relevance index; penalized regression; sparsity problem
fLanguage
English
Publisher
IEEE
Conference_Titel
Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
Conference_Location
New Delhi
Print_ISBN
978-1-4799-3078-4
Type
conf
DOI
10.1109/ICACCI.2014.6968334
Filename
6968334