DocumentCode
50313
Title
Sparsity Learning Formulations for Mining Time-Varying Data
Author
Rongjian Li ; Wenlu Zhang ; Yao Zhao ; Zhenfeng Zhu ; Shuiwang Ji
Author_Institution
Dept. of Comput. Sci., Old Dominion Univ., Norfolk, VA, USA
Volume
27
Issue
5
fYear
2015
fDate
May 1 2015
Firstpage
1411
Lastpage
1423
Abstract
Traditional clustering and feature selection methods consider the data matrix as static. However, the data matrices evolve smoothly over time in many applications. A simple approach to learning from these time-evolving data matrices is to analyze them separately. Such a strategy ignores the time-dependent nature of the underlying data. In this paper, we propose two formulations for evolutionary co-clustering and feature selection based on the fused Lasso regularization. The evolutionary co-clustering formulation is able to identify smoothly varying hidden block structures embedded in the matrices along the temporal dimension. Our formulation is very flexible and allows for imposing smoothness constraints over only one dimension of the data matrices. The evolutionary feature selection formulation can uncover shared features in clustering from time-evolving data matrices. We show that the optimization problems involved are non-convex, non-smooth, and non-separable. To compute the solutions efficiently, we develop a two-step procedure that optimizes the objective function iteratively. We evaluate the proposed formulations using the Allen Developing Mouse Brain Atlas data. Results show that our formulations consistently outperform prior methods.
Keywords
data mining; evolutionary computation; feature selection; learning (artificial intelligence); matrix algebra; optimisation; pattern clustering; Allen Developing Mouse Brain Atlas data; evolutionary coclustering formulation; evolutionary feature selection formulation; fused Lasso regularization; optimization problems; shared features; smoothly varying hidden block structures; smoothness constraints; sparsity learning formulation; temporal dimension; time-evolving data matrices; time-varying data mining; Approximation methods; Data mining; Gene expression; Linear programming; Optimization; Sparse matrices; Vectors; Sparsity learning; bioinformatics; co-clustering; feature selection; neuroinformatics; optimization; time-varying data
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2014.2373411
Filename
6963408