DocumentCode :
2844905
Title :
K-ranked covariance based missing values estimation for microarray data classification
Author :
Sehgal, Muhammad Shoaib B ; Gondal, Iqbal ; Dooley, Laurence
Author_Institution :
GSCIT, Monash Univ., Clayton, Vic., Australia
fYear :
2004
fDate :
5-8 Dec. 2004
Firstpage :
274
Lastpage :
279
Abstract :
Microarray data often contains multiple missing gene expression values, which degrade the performance of statistical and machine learning algorithms. This paper presents a K-ranked diagonal covariance-based missing value estimation algorithm (KRCOV) that significantly outperforms the more commonly used K-nearest neighbour (KNN) imputation algorithm when applied to estimate missing values in BRCA1, BRCA2 and sporadic genetic mutation samples from ovarian cancer. Experimental results confirm that KRCOV outperformed both KNN and zero imputation in terms of classification accuracy when used to impute randomly missing values at rates from 1% to 5%. The classifier used for this purpose was the generalized regression neural network. The paper also offers a hypothesis for why KRCOV performs better than KNN, not only on bioinformatics data but also on other data types with strongly correlated values.
Keywords :
biology computing; covariance analysis; data mining; learning (artificial intelligence); neural nets; pattern classification; regression analysis; K-ranked covariance based missing values estimation; data mining; generalized regression neural network; microarray data classification; Bioinformatics; Cancer; Clustering algorithms; DNA; Degradation; Diseases; Genetic mutations; Humans; Machine learning algorithms; Tumors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Hybrid Intelligent Systems, 2004. HIS '04. Fourth International Conference on
Print_ISBN :
0-7695-2291-2
Type :
conf
DOI :
10.1109/ICHIS.2004.67
Filename :
1410016