DocumentCode
1586568
Title
Applying Bayesian classification to protein structure
Author
Hunter, Lawrence ; States, David J.
Author_Institution
Nat. Libr. of Med., Bethesda, MD, USA
fYear
1991
Firstpage
10
Lastpage
16
Abstract
A report is given on the advantages of Bayesian classification over traditional methods and on the challenges of applying the Autoclass III program, a heuristic Bayesian classifier, in the domain of biotechnology and protein structure classification. The machine learning technique of heuristic Bayesian classification specifically addresses the question of how many classes a dataset should be divided into, as well as what the classifications should be. The method is based on a minimal message length description of the dataset. The cost (in bits) of specifying a classification is added to the cost of accounting for each exemplar in terms of its distance from the class definition, and the total cost is minimized. In addition to providing a well-founded estimate of the number of classes necessary to optimally characterize a dataset, this method also generates test classifications where within-class variances differ significantly.
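The cost trade-off described in the abstract can be illustrated with a minimal sketch. This is not Autoclass III and does not reproduce the authors' procedure: it is a simplified two-part message length score, assuming one-dimensional data, Gaussian class models, a crude k-means step to propose candidate classes, and a fixed cost per class parameter. The names `PARAM_BITS`, `PRECISION`, `kmeans_1d`, `message_length_bits`, and `best_classification`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
import math

PARAM_BITS = 16   # assumption: fixed cost in bits to state one class parameter
PRECISION = 0.01  # assumption: precision to which each value is recorded


def kmeans_1d(values, k, iterations=50):
    """Crude 1-D k-means, used here only to propose k candidate classes."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial centers over the sorted data.
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in values:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return [g for g in groups if g]


def message_length_bits(groups):
    """Bits to state the classes plus bits to encode every exemplar."""
    n = sum(len(g) for g in groups)
    # Part 1: each class is stated by a weight, a mean, and a standard deviation.
    total = len(groups) * 3 * PARAM_BITS
    for g in groups:
        mean = sum(g) / len(g)
        variance = sum((x - mean) ** 2 for x in g) / len(g)
        std = max(math.sqrt(variance), PRECISION)  # floor avoids a zero width
        label_bits = -math.log2(len(g) / n)        # cost of naming the class
        for x in g:
            # Part 2: cost of the exemplar given its class, which grows with
            # its distance from the class mean in units of the class std.
            nats = (0.5 * ((x - mean) / std) ** 2
                    + math.log(std * math.sqrt(2 * math.pi))
                    - math.log(PRECISION))
            total += label_bits + nats / math.log(2)
    return total


def best_classification(values, max_classes=5):
    """Score candidate classifications and return the cheapest class count."""
    scored = {}
    for k in range(1, max_classes + 1):
        groups = kmeans_1d(values, k)
        scored[k] = (message_length_bits(groups), groups)
    best_k = min(scored, key=lambda k: scored[k][0])
    return best_k, scored


# Toy data (invented): one narrow class near 0 and one broad class near 10.
tight = [v for v in (-0.2, -0.1, 0.0, 0.1, 0.2) for _ in range(4)]
broad = [float(v) for v in range(7, 14) for _ in range(3)]
best_k, scored = best_classification(tight + broad)
print(best_k, {k: round(bits) for k, (bits, _) in scored.items()})
```

Adding a class lowers the per-exemplar cost but raises the cost of stating the classification, so the total has a minimum at some class count. Because each class carries its own standard deviation, classes with very different within-class variances are scored on an equal footing.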
Keywords
Bayes methods; classification; data structures; learning systems; medical computing; proteins; Autoclass III program; Bayesian classification; biotechnology; class definition; dataset; heuristic Bayesian classifier; machine learning technique; minimal message length description; protein structure classification; test classifications; within-class variances; Amino acids; Bayesian methods; Biotechnology; Crystallography; Databases; Libraries; Machine learning algorithms; Proteins; Sequences; Shape;
fLanguage
English
Publisher
IEEE
Conference_Titel
Artificial Intelligence Applications, 1991. Proceedings., Seventh IEEE Conference on
Conference_Location
Miami Beach, FL
Print_ISBN
0-8186-2135-4
Type
conf
DOI
10.1109/CAIA.1991.120838
Filename
120838