DocumentCode :
3625485
Title :
Histogram-Based Dimensionality Reduction of Term Vector Space
Author :
Krzysztof Ciesielski;Mieczyslaw A. Klopotek;Slawomir T. Wierzchon
Author_Institution :
Polish Academy of Sciences, Poland
fYear :
2007
fDate :
6/1/2007
Firstpage :
103
Lastpage :
108
Abstract :
One of the most vital problems of free-text document processing is the curse of dimensionality. The paper presents a dimensionality reduction algorithm based on informed feature selection. Terms describing the document are based on histogram-like statistics, which can be computed and incrementally updated at low computational cost. The document representation can adapt to changing characteristics of the document collection. Along with the fundamental concepts, we present an empirical verification of the approach.
Keywords :
"Principal component analysis","Matrix decomposition","Sparse matrices","Karhunen-Loeve transforms","Frequency","Multidimensional systems","Computer science","Statistics","Discrete transforms","Image analysis"
Publisher :
IEEE
Conference_Title :
Computer Information Systems and Industrial Management Applications, 2007. CISIM '07. 6th International Conference on
Print_ISBN :
0-7695-2894-5
Type :
conf
DOI :
10.1109/CISIM.2007.35
Filename :
4273504