Title :
Extracting latent structures in numerical classification: an investigation using two factor models
Author :
Choudhury, Arindum ; Ong, YewSoon ; Keane, A.J.
Author_Institution :
Computational Eng. & Design Center, Southampton Univ., UK
Abstract :
We investigate the use of SVD-based two-factor models for numerical data classification. Motivations for such a study include the widespread success of such models (e.g., LSI) in textual information retrieval, emerging connections with well-established statistical techniques, and the increasing occurrence of mixed-mode (text-and-numeric) data. A direct extension and an efficient modification of the LSI model, both applied to numerical data problems, are presented, and the associated problems and likely remedies are discussed. The techniques under investigation are shown to perform competitively with popular existing numerical classification techniques on a range of synthetic and real-world benchmark data. In particular, we show that the modified LSI proposed in this work avoids confronting the optimal subspace selection problem, yet generalizes well and remains computationally efficient for large data.
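For readers unfamiliar with LSI, the following is a minimal sketch of how a generic LSI-style (truncated SVD) model could be used as a numerical classifier. It is an illustration under stated assumptions, not the authors' method: the function names (`fit_lsi`, `fold_in`, `predict`), the toy data, the choice of `k`, and the cosine nearest-neighbour decision rule are all hypothetical, and the modified LSI mentioned in the abstract is not reproduced here.

```python
import numpy as np

def fit_lsi(X, k):
    """Rank-k SVD of the feature-by-sample matrix X^T (LSI convention).

    X has shape (n_samples, n_features); k must not exceed rank(X).
    """
    U, s, Vt = np.linalg.svd(X.T, full_matrices=False)
    # U_k: features x k, s_k: k singular values, V_k: samples x k
    return U[:, :k], s[:k], Vt[:k].T

def fold_in(U_k, s_k, Q):
    """Project samples Q (m, n_features) into the latent space: S_k^-1 U_k^T q."""
    return (Q @ U_k) / s_k

def predict(U_k, s_k, V_k, y_train, Q, eps=1e-12):
    """Label each query with the class of its most cosine-similar training sample."""
    Z = fold_in(U_k, s_k, Q)
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + eps)
    Vn = V_k / (np.linalg.norm(V_k, axis=1, keepdims=True) + eps)
    sims = Zn @ Vn.T                      # (m, n_train) cosine similarities
    return y_train[np.argmax(sims, axis=1)]

# Hypothetical toy data: two classes pointing along different feature directions.
X_train = np.array([[5.0, 1.0, 0.2], [4.0, 0.8, 0.1], [6.0, 1.1, 0.3],
                    [1.0, 5.0, 0.2], [0.9, 4.0, 0.1], [1.2, 6.0, 0.3]])
y_train = np.array([0, 0, 0, 1, 1, 1])
Q = np.array([[5.5, 1.0, 0.2], [1.0, 5.5, 0.2]])

U_k, s_k, V_k = fit_lsi(X_train, k=2)     # k=2 is an arbitrary choice here
print(predict(U_k, s_k, V_k, y_train, Q))
```

In this sketch the subspace dimension `k` must be chosen by hand, which is the optimal subspace selection problem that the abstract says the modified LSI avoids.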
Keywords :
classification; indexing; singular value decomposition; LSI model; SVD; data modeling; information retrieval; numerical data classification; semantic indexing; similarity metric; Data engineering; Data mining; Design engineering; Filtering; Indexing; Information retrieval; Large scale integration; Multidimensional systems; Numerical models; Predictive models;
Conference_Title :
Neural Information Processing, 2002. ICONIP '02. Proceedings of the 9th International Conference on
Print_ISBN :
981-04-7524-1
DOI :
10.1109/ICONIP.2002.1198992