Title :
Detecting anomalies in symbolic sequence dataset
Author :
Wang, Xin ; Xu, Yaxi
Author_Institution :
Coll. of Comput. Sci., Civil Aviation Flight Univ. of China, Guanghan, China
Abstract :
This paper presents an anomaly detection technique for symbolic sequences based on support vector data description (SVDD). It uses the longest common subsequence (LCS) distance metric to compute pairwise similarity between sequences. We extend the kernel method to the analysis of variable-length sequences by embedding the LCS distance into a Gaussian function, which yields a novel Gaussian LCS kernel. With this kernel, SVDD can directly handle input sequences of variable length and make good use of their sequential information. The performance of the proposed technique was compared with SVDD using the Gaussian RBF kernel and the spectrum kernel. Experimental results show that the proposed technique achieves a higher detection rate and a lower false positive rate than the other techniques.
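The idea of embedding an LCS distance into a Gaussian function can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the normalization in `lcs_distance` (one minus the LCS length divided by the longer sequence length), the kernel form exp(-d^2 / (2 sigma^2)), and all function and parameter names here are assumptions made for illustration, not the authors' definitions.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a, b):
    """Assumed normalization (not from the paper): 1 - LCS / max(len(a), len(b))."""
    if not a and not b:
        return 0.0
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


def gaussian_lcs_kernel(a, b, sigma=1.0):
    """Assumed Gaussian form (not from the paper): exp(-d^2 / (2 * sigma^2))."""
    d = lcs_distance(a, b)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def gram_matrix(sequences, sigma=1.0):
    """Pairwise kernel values for sequences of possibly different lengths."""
    return [[gaussian_lcs_kernel(s, t, sigma) for t in sequences] for s in sequences]
```

Because the kernel is evaluated on pairs of sequences, a Gram matrix such as the one returned by `gram_matrix(["ABCBDAB", "BDCABA", "AB"])` can be computed without padding or truncating the inputs. A Gaussian of a non-Euclidean distance is not guaranteed to be positive semidefinite, so this sketch makes no claim on that point.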
Keywords :
Gaussian processes; sequences; support vector machines; symbol manipulation; Gaussian LCS kernel; Gaussian RBF kernel; Gaussian function; anomaly detection technique; kernel method; longest common subsequence distance metric; pairwise similarity; spectrum kernel; support vector data description; symbolic sequence dataset; variable-length sequences analysis; Aircraft; Kernel; Measurement; Proteins; Training data; Vectors; anomaly detection; longest common subsequence
Conference_Title :
Transportation, Mechanical, and Electrical Engineering (TMEE), 2011 International Conference on
Conference_Location :
Changchun
Print_ISBN :
978-1-4577-1700-0
DOI :
10.1109/TMEE.2011.6199237