DocumentCode :
3408924
Title :
Selection of patient samples and genes for outcome prediction
Author :
Liu, Huiqing ; Li, Jinyan ; Wong, Limsoon
Author_Institution :
Inst. for Infocomm Res., Singapore, Singapore
fYear :
2004
fDate :
16-19 Aug. 2004
Firstpage :
382
Lastpage :
392
Abstract :
Gene expression profiles with clinical outcome data enable monitoring of disease progression and prediction of patient survival at the molecular level. We present a new computational method for outcome prediction. Our idea is to use an informative subset of the original training samples. This subset consists only of short-term survivors who died within a short period and long-term survivors who were still alive after a long follow-up time. These extreme training samples yield a clear platform to identify genes whose expression is related to survival. To find relevant genes, we combine two feature selection methods - entropy measure and Wilcoxon rank sum test - so that a set of sharply discriminating features is identified. The selected training samples and genes are then integrated by a support vector machine to build a prediction model, by which each validation sample is assigned a survival/relapse risk score for drawing Kaplan-Meier survival curves. We apply this method to two data sets: diffuse large-B-cell lymphoma (DLBCL) and primary lung adenocarcinoma. In both cases, patients in high and low risk groups stratified by our risk scores are clearly distinguishable. We also compare our risk scores to some clinical factors, such as the International Prognostic Index score for DLBCL analysis and tumor stage information for lung adenocarcinoma. Our results indicate that gene expression profiles combined with carefully chosen learning algorithms can predict patient survival for certain diseases.
Keywords :
cancer; cellular biophysics; entropy; genetics; learning (artificial intelligence); lung; medical computing; molecular biophysics; patient monitoring; physiological models; support vector machines; tumours; International Prognostic Index score; Kaplan-Meier survival curves; Wilcoxon rank sum test; clinical outcome prediction; diffuse large-B-cell lymphoma; disease progression monitoring; entropy; feature selection methods; gene expression profiles; genes selection; learning algorithms; patient sample selection; patient survival; prediction model; primary lung adenocarcinoma; support vector machine; survival/relapse risk score; tumor stage information; Diseases; Entropy; Gene expression; Information analysis; Lungs; Patient monitoring; Predictive models; Risk analysis; Support vector machines; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
Print_ISBN :
0-7695-2194-0
Type :
conf
DOI :
10.1109/CSB.2004.1332451
Filename :
1332451