DocumentCode
3377323
Title
Determining software trustworthiness in an environmental context
Author
Hurlburt, George ; Voas, J. ; Michael, C.
Author_Institution
Change Index Inc., Tall Timbers, MD, USA
fYear
2011
fDate
20-23 June 2011
Firstpage
1
Lastpage
4
Abstract
A kernel is simply a similarity measure that can be applied to input data in its original representation; for example, if the data is originally represented as numbers, then the absolute value of the difference between two numbers can be used as a kernel function. However, the learning algorithm itself, the kernel machine [4], never sees the data in its original form. Instead, the algorithm sees only the values of the kernel functions that have been applied to the original data. This decouples the data representation from the learning algorithm, and thus allows the same machine learning principles to be applied to a wide variety of data types. The advent of kernel machines [2, 3] greatly simplified learning problems where the input data comes in the form of strings. String kernels, which are just similarity measures on strings, can be used to train a kernel machine on string data simply by replacing its existing kernel function with a string kernel function. There are numerous string kernels, but the emphasis is on efficiency. For example, the well-known Levenshtein distance (a.k.a. the edit distance between two strings) could be used as the basis of a string kernel, but usually this is not done because the edit distance takes quadratic time to compute. One way to construct linear-time string kernels is to use suffix trees [1], which can be used to obtain measures like the longest common substring of two strings, or a measure that is close to the number of common substrings.
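The decoupling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a kernel perceptron as a stand-in for the kernel machine and a k-mer counting ("spectrum") kernel as a stand-in for the suffix-tree kernels the abstract mentions; all function names and the toy strings are hypothetical. The quadratic-time Levenshtein dynamic program is included only for contrast.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic program: O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def spectrum_kernel(a: str, b: str, k: int = 2) -> int:
    """Similarity = inner product of length-k substring counts.

    Assumption: a simple stand-in string kernel, not the suffix-tree
    construction referred to in the abstract.
    """
    counts_a = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    counts_b = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return sum(n * counts_b[s] for s, n in counts_a.items())


def train_kernel_perceptron(gram, labels, epochs=10):
    """Learn from kernel values only; the raw strings are never passed in."""
    alpha = [0] * len(labels)
    for _ in range(epochs):
        mistakes = 0
        for i, y in enumerate(labels):
            score = sum(alpha[j] * labels[j] * gram[j][i]
                        for j in range(len(labels)))
            if y * score <= 0:  # misclassified (or undecided): update
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


def predict(alpha, labels, kernel_values):
    """Classify a new item from its kernel values against the training items."""
    score = sum(a * y * kv for a, y, kv in zip(alpha, labels, kernel_values))
    return 1 if score > 0 else -1


# Hypothetical toy data: two classes of short strings.
train = ["aaaa", "aaab", "bbbb", "bbba"]
labels = [1, 1, -1, -1]
gram = [[spectrum_kernel(s, t) for t in train] for s in train]
alpha = train_kernel_perceptron(gram, labels)
label_for_aab = predict(alpha, labels, [spectrum_kernel(s, "aab") for s in train])
```

Swapping `spectrum_kernel` for any other similarity function changes only how `gram` is built; the training and prediction code stays the same, which is the point the abstract makes about replacing one kernel function with another.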
Keywords
learning (artificial intelligence); programming environments; Levenshtein distance; data representation; edit distance; kernel machine; learning algorithm; linear-time string kernel; similarity measure; software environment; software trustworthiness; string kernel function; suffix trees; Humans; Kernel; Machine learning; Monitoring; Semantics; Software systems; String Kernels; software environments; trustworthiness;
fLanguage
English
Publisher
IEEE
Conference_Titel
Prognostics and Health Management (PHM), 2011 IEEE Conference on
Conference_Location
Montreal, QC
Print_ISBN
978-1-4244-9828-4
Type
conf
DOI
10.1109/ICPHM.2011.6024359
Filename
6024359