Title :
Outlier detection under interval and fuzzy uncertainty: algorithmic solvability and computational complexity
Author :
Kreinovich, Vladik ; Patangay, Praveen ; Longpré, Luc ; Starks, Scott A. ; Campos, Cynthia ; Ferson, Scott ; Ginzburg, Lev
Author_Institution :
NASA Pan-American Center for Earth & Environ. Studies, Texas Univ., El Paso, TX, USA
Abstract :
In many application areas, it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values x1,...,xn, compute the sample average E and the sample standard deviation σ, and then mark a value x as an outlier if x lies outside the k0-sigma interval [E-k0·σ, E+k0·σ] (for some pre-selected parameter k0). In real life, we often know only interval ranges [x̲i, x̄i] for the normal values x1,...,xn. In this case, we only have intervals of possible values for the bounds E-k0·σ and E+k0·σ. We can therefore identify outliers as values that are outside all k0-sigma intervals. In this paper, we analyze the computational complexity of these outlier detection problems and provide efficient algorithms that solve some of these problems (under reasonable conditions). We also provide algorithms that estimate the degree of "outlier-ness" of a given value x, measured as the largest value k0 for which x is outside the corresponding k0-sigma interval.
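As an illustration of the traditional (point-valued) k0-sigma test described in the abstract, a minimal Python sketch follows. It covers only the case of exactly known normal values, not the interval or fuzzy algorithms of the paper. The function names are hypothetical, and the use of the n-1 (Bessel-corrected) sample standard deviation is an assumption, since the abstract does not specify the normalization.

```python
from statistics import mean, stdev


def k_sigma_interval(normal_values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for exactly known normal values.

    Assumption: sigma is the n-1 sample standard deviation (statistics.stdev).
    """
    e = mean(normal_values)
    sigma = stdev(normal_values)
    return e - k0 * sigma, e + k0 * sigma


def is_outlier(x, normal_values, k0):
    """Mark x as an outlier if it lies outside the k0-sigma interval."""
    lower, upper = k_sigma_interval(normal_values, k0)
    return x < lower or x > upper


def outlier_degree(x, normal_values):
    """Return |x - E| / sigma.

    x is outside the k0-sigma interval exactly when k0 is below this ratio,
    so the ratio serves as the degree of "outlier-ness" in the point-valued
    case. Raises ZeroDivisionError if all normal values coincide (sigma == 0).
    """
    e = mean(normal_values)
    sigma = stdev(normal_values)
    return abs(x - e) / sigma
```

For the normal values 0, 2, 4 (so E = 2 and, under the n-1 assumption, σ = 2), the 2-sigma interval is [-2, 6]: the value 7 is flagged as an outlier, while 5 is not.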
Keywords :
computability; computational complexity; data analysis; fuzzy set theory; statistical analysis; uncertainty handling; algorithmic solvability; degree of outlierness; fuzzy uncertainty; interval uncertainty; outlier detection; Diseases; Earth; Fault detection; Geophysical measurements; Geophysics; Medical tests; Minerals; NASA; Uncertainty
Conference_Title :
22nd International Conference of the North American Fuzzy Information Processing Society, 2003 (NAFIPS 2003)
Print_ISBN :
0-7803-7918-7
DOI :
10.1109/NAFIPS.2003.1226818