DocumentCode
2068547
Title
Detecting anomalies in symbolic sequence dataset
Author
Wang, Xin ; Xu, Yaxi
Author_Institution
Coll. of Comput. Sci., Civil Aviation Flight Univ. of China, Guanghan, China
fYear
2011
fDate
16-18 Dec. 2011
Firstpage
443
Lastpage
447
Abstract
This paper presents an anomaly detection technique for symbolic sequences based on support vector data description (SVDD). It uses the longest common subsequence (LCS) distance metric to compute the pairwise similarity between sequences. The kernel method is extended to variable-length sequences by embedding the LCS distance in a Gaussian function, which yields a novel Gaussian LCS kernel. With this kernel, SVDD can handle input sequences of variable length directly and make good use of their sequential information. The proposed technique was compared with SVDD using a Gaussian RBF kernel and SVDD using a spectrum kernel. Experimental results show that it achieves a higher detection rate and a lower false positive rate than these baselines.
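The kernel construction described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the exact LCS distance or the kernel parameterisation, so the normalisation `1 - LCS / max(len)`, the `sigma` parameter, and all function names here are hypothetical and not taken from the paper.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous prefix of a
    for x in a:
        curr = [0]  # LCS with an empty prefix of b is always 0
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_distance(a, b):
    """Assumed normalisation (not from the paper): 1 - LCS / max(len(a), len(b))."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty sequences are treated as identical
    return 1.0 - lcs_length(a, b) / longest


def gaussian_lcs_kernel(a, b, sigma=0.5):
    """Gaussian function of the LCS distance; sigma is a hypothetical width parameter."""
    d = lcs_distance(a, b)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


if __name__ == "__main__":
    # Sequences of different lengths can be compared without padding.
    print(gaussian_lcs_kernel("ABCBDAB", "BDCABA"))
```

Because the distance is computed on the raw sequences, no fixed-length feature vector is needed. A kernel built this way may not be positive semidefinite in general, so a kernel matrix passed to an SVDD solver may need to be checked or regularised.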
Keywords
Gaussian processes; sequences; support vector machines; symbol manipulation; Gaussian LCS kernel; Gaussian RBF kernel; Gaussian function; anomaly detection technique; kernel method; longest common subsequence distance metric; pairwise similarity; spectrum kernel; support vector data description; symbolic sequence dataset; variable-length sequences analysis; Aircraft; Kernel; Measurement; Proteins; Support vector machines; Training data; Vectors; anomaly detection; longest common subsequence
fLanguage
English
Publisher
ieee
Conference_Titel
Transportation, Mechanical, and Electrical Engineering (TMEE), 2011 International Conference on
Conference_Location
Changchun
Print_ISBN
978-1-4577-1700-0
Type
conf
DOI
10.1109/TMEE.2011.6199237
Filename
6199237