Title :
Estimation of missing values in clinical laboratory measurements of ICU patients using a weighted K-nearest neighbors algorithm
Author :
Abdala, O.T. ; Saeed, M.
Author_Institution :
Massachusetts Inst. of Technol., Cambridge, MA, USA
Abstract :
In the modern intensive care unit (ICU), the physiologic state of critically-ill patients is monitored through a diverse array of biosensors and laboratory measurements. The sheer volume of data that is collected has overwhelmed clinicians charged with assimilating and transforming the data into clinical hypotheses. The development of automated algorithms with vigilant monitoring and clinical decision-support capabilities would help to alleviate this "information-overload" challenge. The inherent noise and measurement error add a level of complication to the real-time analysis and interpretation of medical data. One class of "noise" in medical data can be characterized by the absence or unavailability of a desired measurement. We have analyzed a large collection of clinical laboratory data (blood chemistry, blood gases, complete blood counts) from over 600 ICU/CCU patients in the MIMIC II database. An analysis of the frequency of missing data values across patient records for each measurement was completed. Furthermore, we have developed a novel method to estimate the values of missing data by the use of a weighted K-nearest neighbors algorithm. We propose a weighting scheme that exploits the correlation between a "missing" dimension and available data values from other fields. We compare our technique with several popular missing value estimation techniques: principal components analysis, least squares estimation, mean imputation, and classical k-nearest neighbors. The mean standardized imputation error ranges from a minimum of 0.31 to a maximum of 0.75, depending on the imputed dimension. The mean standardized imputation error over all dimensions is 0.45.
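The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of one plausible reading: each available field contributes to the neighbor distance in proportion to the absolute Pearson correlation between that field and the missing one. The function name `weighted_knn_impute`, the z-score standardization, the use of only fully observed donor rows, and the unweighted mean over the k neighbors are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def weighted_knn_impute(X, row, col, k=3):
    """Estimate the missing entry X[row, col] from its k nearest donor rows.

    Assumption (not from the paper): each observed field is weighted by the
    absolute Pearson correlation between that field and the missing field.
    """
    X = np.asarray(X, dtype=float)
    target = X[row]

    # Fields that are observed in the target row (excluding the missing one).
    feats = [j for j in range(X.shape[1])
             if j != col and not np.isnan(target[j])]

    # Donor rows: the missing field and all comparison fields are observed.
    donors = [i for i in range(X.shape[0])
              if i != row
              and not np.isnan(X[i, col])
              and not np.isnan(X[i, feats]).any()]
    D = X[donors]

    # Standardize comparison fields so that no field dominates by scale.
    mu = D[:, feats].mean(axis=0)
    sd = D[:, feats].std(axis=0)
    sd[sd == 0] = 1.0
    Z = (D[:, feats] - mu) / sd
    z = (target[feats] - mu) / sd

    # Correlation-based weights; undefined correlations are treated as 0.
    w = np.array([abs(np.corrcoef(D[:, j], D[:, col])[0, 1]) for j in feats])
    w = np.nan_to_num(w)

    # Weighted Euclidean distance from the target row to every donor row.
    dist = np.sqrt((w * (Z - z) ** 2).sum(axis=1))
    nearest = np.argsort(dist)[:k]

    # Plain average of the neighbors' values in the missing field.
    return float(D[nearest, col].mean())
```

With all weights set to 1, the same function reduces to classical k-nearest neighbors imputation, which is one of the baselines the abstract compares against.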
Keywords :
biochemistry; biosensors; blood; decision support systems; diseases; least squares approximations; measurement errors; medical computing; patient care; patient monitoring; ICU patient; MIMIC II database; automated algorithm; biosensor; blood chemistry; blood count; blood gas; clinical laboratory measurement; critically-ill patient monitoring; decision-support capability; inherent noise; intensive care unit; least squares estimation; mean imputation; measurement error; missing value estimation; patient record; principal component analysis; real-time analysis; vigilant monitoring; weighted K-nearest neighbors algorithm; Biomedical monitoring; Biosensors; Blood; Chemistry; Computerized monitoring; Laboratories; Measurement errors; Noise level; Noise measurement; Patient monitoring;
Conference_Title :
Computers in Cardiology, 2004
Print_ISBN :
0-7803-8927-1
DOI :
10.1109/CIC.2004.1443033