Generate, test, and explain: synthesizing regularity exposing attributes in large protein databases

Author

De La Maza, Michael

Author_Institution

Artificial Intelligence Lab., MIT, Cambridge, MA, USA

Volume

5

fYear

1994

fDate

4-7 Jan. 1994

Firstpage

123

Lastpage

132

Abstract

Describes a database mining system that synthesizes regularity-exposing attributes in large protein databases. After processing the primary and secondary structure data, this system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics (size, charge, and hydrophobicity) for tertiary structure prediction. A neural network trained using this 16-bit representation achieves a performance accuracy on the secondary structure prediction problem that is comparable to the one achieved by a neural network trained using the standard 24-bit amino acid representation.

Keywords

biology computing; explanation; macromolecular configurations; neural nets; proteins; very large databases; 16-bit representation; amino acid representation; charge; database mining system; hydrophobicity; large protein databases; neural network training; performance accuracy; primary structure data processing; regularity-exposing attribute synthesis; secondary structure prediction; size; tertiary structure prediction

fLanguage

English

Publisher

IEEE

Conference_Titel

System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on

Conference_Location

Wailea, HI, USA

Print_ISBN

0-8186-5090-7

Type

conf

DOI

10.1109/HICSS.1994.323559

Filename

323559