DocumentCode :
3165364
Title :
Depth-Based Novelty Detection and Its Application to Taxonomic Research
Author :
Chen, Yixin ; Bart, H.L. ; Dang, Xin ; Peng, Hanxiang
Author_Institution :
Univ. of Mississippi, Hattiesburg
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
113
Lastpage :
122
Abstract :
It is estimated that less than 10 percent of the world's species have been described, yet species are being lost daily due to human destruction of natural habitats. The job of describing the earth's remaining species is made harder by the shrinking number of practicing taxonomists and the very slow pace of traditional taxonomic research. In this article, we tackle, from a novelty detection perspective, one of the most important and challenging research objectives in taxonomy: new species identification. We propose a unique and efficient novelty detection framework based on statistical depth functions. Statistical depth functions provide a "center-outward ordering" of multidimensional data, starting from the "deepest" point. In this sense, they can detect observations that appear extreme relative to the rest of the observations, i.e., novelty. Of the various statistical depths, the spatial depth is especially appealing because of its computational efficiency and mathematical tractability. We propose a novel statistical depth, the kernelized spatial depth (KSD), which generalizes the spatial depth via positive definite kernels. With a properly chosen kernel, the KSD can capture the local structure of a data set where the spatial depth fails. Observations with depth values less than a threshold are declared novel. The proposed algorithm is simple in structure: for a given kernel, the threshold is the only parameter. We give an upper bound on the false alarm probability of a depth-based detector, which can be used to determine the threshold. An experimental study demonstrates its excellent potential in new species discovery.
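The depth-based detector described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common sample form of the spatial depth (one minus the norm of the average unit vector from the data points to the query point), a Gaussian kernel for the kernelized version, and a simple 1/n normalization; the function names, the bandwidth `sigma`, and the `threshold` argument are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def spatial_depth(x, data):
    """Sample spatial depth of x: 1 - || mean of unit vectors (x - x_i) ||."""
    data = np.asarray(data, dtype=float)
    diffs = np.asarray(x, dtype=float) - data
    norms = np.linalg.norm(diffs, axis=1)
    units = np.zeros_like(diffs)
    nonzero = norms > 0          # points equal to x contribute a zero vector
    units[nonzero] = diffs[nonzero] / norms[nonzero, None]
    return 1.0 - np.linalg.norm(units.mean(axis=0))


def kernelized_spatial_depth(x, data, sigma=1.0):
    """Spatial depth computed in the kernel feature space (assumed form).

    Inner products of feature-space differences are expanded with the
    kernel trick: <phi(x)-phi(y), phi(x)-phi(z)> = k(x,x)+k(y,z)-k(x,y)-k(x,z).
    """
    data = np.asarray(data, dtype=float)
    x = np.atleast_2d(np.asarray(x, dtype=float))
    k_xx = gaussian_kernel(x, x, sigma)[0, 0]
    k_xy = gaussian_kernel(x, data, sigma)[0]        # shape (n,)
    k_yy = gaussian_kernel(data, data, sigma)        # shape (n, n)
    # Feature-space distances ||phi(x) - phi(y_i)||.
    delta = np.sqrt(np.maximum(k_xx + np.diag(k_yy) - 2.0 * k_xy, 0.0))
    inv = np.zeros_like(delta)
    nonzero = delta > 1e-12      # skip points that coincide with x
    inv[nonzero] = 1.0 / delta[nonzero]
    inner = k_xx + k_yy - k_xy[:, None] - k_xy[None, :]
    total = inv @ inner @ inv    # squared norm of the summed unit vectors
    return 1.0 - np.sqrt(max(total, 0.0)) / len(data)


def is_novel(x, data, threshold, sigma=1.0):
    """Declare x novel when its kernelized depth falls below the threshold."""
    return kernelized_spatial_depth(x, data, sigma) < threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two tight clusters; the midpoint between them lies far from all data.
    left = rng.normal(loc=(-3.0, 0.0), scale=0.1, size=(50, 2))
    right = rng.normal(loc=(3.0, 0.0), scale=0.1, size=(50, 2))
    data = np.vstack([left, right])
    for name, point in [("midpoint", (0.0, 0.0)), ("cluster centre", (-3.0, 0.0))]:
        print(name,
              "spatial:", round(spatial_depth(point, data), 3),
              "kernelized:", round(kernelized_spatial_depth(point, data, sigma=0.3), 3))
```

In this two-cluster example the plain spatial depth ranks the empty midpoint as deeper than a cluster centre, because the unit vectors from the two clusters cancel. The kernelized depth, with a bandwidth on the scale of a cluster, reverses that ordering, which is the kind of local structure the abstract refers to. How the threshold is chosen in the paper (via the false alarm bound) is not reproduced here.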
Keywords :
biology computing; data mining; learning (artificial intelligence); probability; zoology; center-outward ordering; data mining; depth-based detector; depth-based novelty detection; false alarm probability; kernelized spatial depth; machine learning; mathematical tractability; multidimensional data; new species identification; statistical depth functions; taxonomic research; Character recognition; Data mining; Detectors; Earth; Humans; Kernel; Machine learning; Support vector machines; Taxonomy; USA Councils;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.10
Filename :
4470235