  • DocumentCode
    2286409
  • Title
    Empirical modeling of very large data sets using neural networks
  • Author
    Owens, Aaron J.
  • Author_Institution
    DuPont Central Res. & Dev., Wilmington, DE, USA
  • Volume
    6
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    302
  • Abstract
    Building empirical predictive models from very large data sets is challenging. One has to deal both with the 'curse of dimensionality' (hundreds or thousands of variables) and with 'too many records' (many thousands of instances). While neural networks [Rumelhart et al., 1986] are widely recognized as universal function approximators [Cybenko, 1989], their training time rapidly increases with the number of variables and instances. I discuss practical methods for overcoming this problem so that neural network models can be developed for very large databases. The methods include: dimensionality reduction with neural net modeling, PLS modeling, and bottleneck neural networks; sub-sampling and re-sampling with many smaller data sets to reduce training time; and a committee of networks to make the final prediction more robust and to estimate its uncertainty.
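    A minimal sketch of the sub-sampling and committee-of-networks approach outlined above, assuming Python with NumPy and scikit-learn (tools that postdate this paper, used purely for illustration); the synthetic data set, network size, and sample counts are hypothetical, not the author's:

        # Illustrative only: demonstrates sub-sampling plus a committee of
        # networks, not the author's original implementation.
        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.utils import resample

        rng = np.random.default_rng(0)

        # Synthetic stand-in for a "very large" data set (hypothetical).
        X = rng.normal(size=(100_000, 20))
        y = X[:, 0] * np.sin(X[:, 1]) + 0.1 * rng.normal(size=100_000)

        committee = []
        for seed in range(10):
            # Sub-sample a much smaller training set to cut training time.
            Xs, ys = resample(X, y, n_samples=2_000, random_state=seed)
            net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=500,
                               random_state=seed).fit(Xs, ys)
            committee.append(net)

        # Committee prediction: the mean is the final (more robust) output,
        # and the spread across members estimates its uncertainty.
        preds = np.stack([net.predict(X[:100]) for net in committee])
        mean, std = preds.mean(axis=0), preds.std(axis=0)

    Training many small networks on re-sampled subsets keeps each individual fit cheap, while averaging their outputs makes the prediction more robust and yields a per-point uncertainty estimate, as the abstract describes.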
  • Keywords
    database theory; learning (artificial intelligence); neural nets; very large databases; PLS modeling; bottleneck neural networks; committee of networks; dimensionality reduction; neural network models; neural networks; predictive models; universal function approximators; very large data sets; Arithmetic; Artificial neural networks; Databases; Feedforward neural networks; Input variables; Neural networks; Predictive models; Research and development; Robustness; Uncertainty
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Title
    Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000)
  • Conference_Location
    Como, Italy
  • ISSN
    1098-7576
  • Print_ISBN
    0-7695-0619-4
  • Type
    conf
  • DOI
    10.1109/IJCNN.2000.859413
  • Filename
    859413