Title :
Capturing knowledge through top-down induction of decision trees
Author_Institution :
Dept. of Comput. Sci., Wollongong Univ., NSW, Australia
fDate :
6/1/1990 12:00:00 AM
Abstract :
TDIDT (top-down induction of decision trees) methods for heuristic rule generation lead to unnecessarily complex representations of induced knowledge and are overly sensitive to noise in training data. Practical alternatives to TDIDT approaches, which lead to more direct representations of the same knowledge, are examined. The alternatives are more immune to problems with spurious correlations in small data sets and to noise in initial training data. These knowledge representation problems and alternatives are examined in the context of chess, for which a TDIDT algorithm called the ID3 algorithm was originally devised. Modifications to the ID3 algorithm are proposed so that users can measure heuristically the information content of attributes to guide search. The program iteratively examines all positive instances remaining to be covered, along with negative training-set instances; search is not subject to irrelevant context restrictions. This algorithm is no more complex than TDIDT, is just as fast and less sensitive to noise, and leads to clearer representations of the information present in training-set data.
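Illustrative note: the abstract does not spell out how the information content of an attribute is measured in the modified algorithm. As a rough point of reference only, the sketch below shows the entropy-based information gain commonly associated with ID3-style attribute selection. The function names, the toy rows, and the attribute names (`in_check`, `rook_safe`, `won`) are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Drop in entropy of `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value each row has for the attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the partitions after the split.
    remainder = sum(
        len(part) / len(rows) * entropy(part) for part in partitions.values()
    )
    return base - remainder


def best_attribute(rows, attributes, target):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy training set: `in_check` separates the classes perfectly,
# while `rook_safe` says nothing about the outcome.
rows = [
    {"in_check": True, "rook_safe": True, "won": False},
    {"in_check": True, "rook_safe": False, "won": False},
    {"in_check": False, "rook_safe": True, "won": True},
    {"in_check": False, "rook_safe": False, "won": True},
]

chosen = best_attribute(rows, ["in_check", "rook_safe"], "won")
```

On this toy set the split on `in_check` removes all uncertainty about `won`, so it is the attribute a gain-driven search would test first; an attribute such as `rook_safe`, whose gain is zero, is the kind of test that adds complexity to an induced tree without adding knowledge.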
Keywords :
decision support systems; heuristic programming; knowledge acquisition; knowledge representation; search problems; trees (mathematics); ID3; TDIDT algorithm; TDIDT approaches; attributes; chess; context restrictions; decision trees; direct representations; heuristic rule generation; induced knowledge; information content; knowledge representation problems; negative training-set instances; positive instances; search; small data sets; spurious correlations; top-down induction; training data; training-set data; Control systems; Costs; Data engineering; Decision trees; Encoding; Expert systems; Knowledge engineering; Testing; Time factors; Training data;
Journal_Title :
IEEE Expert