Abstract:
This paper describes preliminary work toward a methodology for classifying, organizing, and analyzing diagnostic data (both quantitative and qualitative) in order to create taxa and subsequently draw statistical inferences about the delineation of high- and low-risk target populations. Specifically, the methodology encompasses: (1) numerical taxonomic techniques and Q-mode clustering (with dendrograms), used in conjunction with factor analysis methods, namely principal components extraction followed by various orthogonal and non-orthogonal rotations (varimax, quartimax, quartimin, bi-quartimin, covarimin, and direct quartimin), with the clustering techniques applied to physiological data using both correlation and distance function matrices as the basis for generating taxa and further classifications; and (2) multivariate statistical analysis, including linear discriminant functions and polynomial regression, for analyzing the overall structure of the data and for formulating and testing statistical hypotheses concerning the allocation of individuals to existing taxa. The methodology has also drawn attention to gaps in the economical computer processing of large volumes of data, and has given considerable impetus to the development of a compendium of sophisticated computer techniques.
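As an illustration of the Q-mode idea named above (clustering individuals rather than variables, starting from a distance function matrix), the following is a minimal sketch and not the procedure used in this work. It assumes Euclidean distance and single-linkage merging, neither of which is specified in the abstract; the function names and the two-variable profiles are hypothetical and invented purely for demonstration.

```python
import math

# Hypothetical physiological profiles (one row per individual).
# The values are invented for illustration, not taken from any study.
PROFILES = [
    [120.0, 70.0],
    [122.0, 72.0],
    [160.0, 95.0],
    [158.0, 98.0],
    [121.0, 69.0],
]


def euclidean(a, b):
    """Euclidean distance between two profiles (an assumed distance function)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def q_mode_clusters(profiles, n_taxa):
    """Agglomerative single-linkage clustering of individuals (Q-mode).

    Returns the final clusters (lists of row indices) and the merge history
    as (members_a, members_b, distance) tuples, which is the information a
    dendrogram would display.
    """
    clusters = [[i] for i in range(len(profiles))]
    merges = []
    while len(clusters) > n_taxa:
        best = None  # (distance, index_a, index_b) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((tuple(clusters[i]), tuple(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters, merges


taxa, merges = q_mode_clusters(PROFILES, 2)
for members in taxa:
    print("taxon:", sorted(members))
for left, right, height in merges:
    print("merged", left, "+", right, "at distance", round(height, 2))
```

In practice the variables would usually be standardized before distances are computed, and a correlation matrix between individuals could replace the distance matrix; both choices are left out here to keep the sketch short.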