Title of article :
Clustering of temporal gene expression data by regularized spline regression and an energy based similarity measure
Author/Authors :
Zhang, Weifeng and Liu, Chao-Chun and Yan, Hong
Issue Information :
Journal with serial number, year 2010
Pages :
8
From page :
3969
To page :
3976
Abstract :
Clustering analysis of temporal gene expression data is widely used to study dynamic biological systems, for example to identify sets of genes that are regulated by the same mechanism. However, temporal gene expression data often contain noise, missing data points, and non-uniformly sampled time points, which makes it difficult for traditional clustering methods to extract meaningful information. In this paper, we introduce an improved clustering approach based on regularized spline regression and an energy-based similarity measure. The proposed approach models each gene expression profile as a B-spline expansion, whose spline coefficients are estimated by a regularized least squares scheme on the observed data. To compensate for the inadequate information in noisy and short gene expression profiles, we use each gene's correlated genes as the test set to choose the optimal number of basis functions and the regularization parameter. We show that this treatment helps to avoid over-fitting. After fitting the continuous representations of the gene expression profiles, we use an energy-based similarity measure for clustering. The energy-based measure incorporates the temporal information and relative changes of the time series through their first and second derivatives. We demonstrate that our method is robust to noise and can produce meaningful clustering results.
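The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions that the abstract does not confirm: a cubic B-spline with clamped, evenly spaced knots, a second-difference penalty on the coefficients as the regularizer, and the Teager-Kaiser form x'(t)^2 - x(t) x''(t) as the energy operator. The function names (`fit_spline`, `energy_profile`) and the synthetic sine data are hypothetical, and the paper's selection of the basis size and regularization parameter from correlated genes is not reproduced.

```python
import math

def basis(i, k, knots, t):
    """Cox-de Boor recursion: i-th B-spline of degree k evaluated at t."""
    if k == 0:
        # Close the last non-empty interval so t == right end is covered.
        last = knots[i + 1] == knots[-1] and t == knots[-1]
        inside = knots[i] <= t < knots[i + 1]
        return 1.0 if inside or (knots[i] < knots[i + 1] and last) else 0.0
    out = 0.0
    if knots[i + k] > knots[i]:
        out += (t - knots[i]) / (knots[i + k] - knots[i]) * basis(i, k - 1, knots, t)
    if knots[i + k + 1] > knots[i + 1]:
        out += ((knots[i + k + 1] - t) / (knots[i + k + 1] - knots[i + 1])
                * basis(i + 1, k - 1, knots, t))
    return out

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit_spline(times, values, n_basis=6, lam=0.1, k=3):
    """Penalized least squares fit; returns a callable curve t -> x(t)."""
    lo, hi = min(times), max(times)
    n_inner = n_basis - k - 1
    inner = [lo + (hi - lo) * j / (n_inner + 1) for j in range(1, n_inner + 1)]
    knots = [lo] * (k + 1) + inner + [hi] * (k + 1)  # clamped knot vector
    B = [[basis(i, k, knots, t) for i in range(n_basis)] for t in times]
    # Normal equations (B^T B + lam * D^T D) c = B^T y, where D takes second
    # differences of the coefficients (assumed penalty, see lead-in).
    A = [[sum(row[i] * row[j] for row in B) for j in range(n_basis)]
         for i in range(n_basis)]
    for r in range(n_basis - 2):
        d = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i in d:
            for j in d:
                A[i][j] += lam * d[i] * d[j]
    rhs = [sum(row[i] * y for row, y in zip(B, values)) for i in range(n_basis)]
    coef = solve(A, rhs)

    def curve(t):
        t = min(max(t, lo), hi)  # keep evaluation inside the fitted range
        return sum(c * basis(i, k, knots, t) for i, c in enumerate(coef))
    return curve

def energy_profile(f, a, b, n=50):
    """Assumed Teager-Kaiser energy x'^2 - x*x'' via central differences."""
    h = (b - a) / (n + 1)
    out = []
    for m in range(1, n + 1):
        t = a + m * h
        x0, xm, xp = f(t), f(t - h), f(t + h)
        d1 = (xp - xm) / (2 * h)
        d2 = (xp - 2 * x0 + xm) / (h * h)
        out.append(d1 * d1 - x0 * d2)
    return out

# Synthetic, non-uniformly sampled profile (illustrative values only).
times = [0, 1, 2, 4, 5, 7, 9, 10]
values = [math.sin(0.5 * t) for t in times]
curve = fit_spline(times, values, n_basis=6, lam=0.1)
energy = energy_profile(curve, 0, 10)
```

Because the fitted curve is continuous, the energy can be evaluated on a common grid even when genes are sampled at different time points. How the paper turns such energy values into a similarity between two genes is not stated in the abstract, so that step is left out here.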
Keywords :
Spline model , Regularized regression , Energy operator , Temporal gene expression data analysis , Clustering
Journal title :
PATTERN RECOGNITION
Serial Year :
2010
Record number :
1733832