Determining software trustworthiness in an environmental context

Author

Hurlburt, George ; Voas, J. ; Michael, C.

Author_Institution

Change Index Inc., Tall Timbers, MD, USA

fYear

2011

fDate

20-23 June 2011

Firstpage

1

Lastpage

4

Abstract

A kernel is simply a similarity measure that can be applied to input data in its original representation; for example, if the data is originally represented as numbers, then the absolute value of the difference between two numbers can be used as a kernel function. However, the learning algorithm itself (here, the kernel machine [4]) never sees the data in its original form. Instead, the algorithm sees only the values of various kernel functions that have been applied to the original data. This decouples the data representation from the learning algorithm, and thus allows the same machine learning principles to be applied to a wide variety of data types. The advent of kernel machines [2, 3] greatly simplified learning problems where the input data comes in the form of strings. String kernels, which are just similarity measures on strings, can be used to train a kernel machine on string data simply by replacing its existing kernel function with a string kernel function. There are numerous string kernels, but the emphasis is on efficiency. For example, the well-known Levenshtein distance (a.k.a. the edit distance between two strings) could be used as the basis of a string kernel, but usually this is not done because the edit distance takes quadratic time to compute. One way to construct linear-time string kernels is to use suffix trees [1], which can be used to obtain measures such as the longest common substring of two strings, or a measure that is close to the number of common substrings.
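
The ideas in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`edit_distance`, `longest_common_substring`, `gram_matrix`) are hypothetical, and the longest-common-substring routine uses a simple quadratic dynamic program rather than the linear-time suffix-tree construction the abstract refers to. The point is only to show that a learner can be handed a table of pairwise similarity values and never touch the raw strings.

```python
from typing import Callable, List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance by dynamic programming.

    Runs in O(len(a) * len(b)) time, which is the quadratic cost
    the abstract mentions.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest common substring of a and b.

    Assumption: a quadratic dynamic program stands in for the
    suffix-tree method, which would give the same value in linear time.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            # Length of the common run ending at this pair of characters.
            run = prev[j - 1] + 1 if ca == cb else 0
            curr.append(run)
            best = max(best, run)
        prev = curr
    return best


def gram_matrix(items: Sequence[str],
                kernel: Callable[[str, str], float]) -> List[List[float]]:
    """Pairwise similarity table: the only view of the data a kernel machine needs."""
    return [[kernel(x, y) for y in items] for x in items]


if __name__ == "__main__":
    words = ["kernel", "colonel", "kitten"]
    for row in gram_matrix(words, longest_common_substring):
        print(row)
    print(edit_distance("kitten", "sitting"))
```

Swapping `longest_common_substring` for another string similarity changes the table but not the code that consumes it, which is the decoupling the abstract describes. Whether a given similarity measure is a mathematically valid kernel for a particular learner is a separate question that this sketch does not address.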

Keywords

learning (artificial intelligence); programming environments; Levenshtein distance; data representation; edit distance; kernel machine; learning algorithm; linear-time string kernel; similarity measure; software environment; software trustworthiness; string kernel function; suffix trees; Humans; Kernel; Machine learning; Monitoring; Semantics; Software systems; String Kernels; software environments; trustworthiness

fLanguage

English

Publisher

IEEE

Conference_Titel

Prognostics and Health Management (PHM), 2011 IEEE Conference on

Conference_Location

Montreal, QC

Print_ISBN

978-1-4244-9828-4

Type

conf

DOI

10.1109/ICPHM.2011.6024359

Filename

6024359