DocumentCode :
2276117
Title :
Using Wavelets to Classify Documents
Author :
Xexeo, Geraldo ; de Souza, Jano ; Castro, Patrícia F. ; Pinheiro, Wallace A.
Author_Institution :
Programa de Eng. de Sist. e Comput., COPPE/UFRJ, Rio de Janeiro
Volume :
1
fYear :
2008
fDate :
9-12 Dec. 2008
Firstpage :
272
Lastpage :
278
Abstract :
Discrete Fourier and cosine transforms are currently used to classify documents. This article proposes a new strategy that uses wavelets to represent text data and reduce its dimensionality. Wavelets have been used extensively for dimensionality reduction in signal processing. In this work, we show that a text document, after a simple reorganization of its terms, can be treated as a signal and analyzed with signal processing tools. We demonstrate that this new representation captures the most relevant features of documents in a compact form and that it improves the performance of the classification algorithm.
Keywords :
Fourier transforms; classification; discrete cosine transforms; text analysis; wavelet transforms; Fourier transformations; cosine discrete transformations; documents classification; signal processing; text document; Classification algorithms; Discrete wavelet transforms; Intelligent agent; Military computing; Multiresolution analysis; Signal analysis; Signal processing algorithms; Text categorization; KNN; Wavelets
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology, 2008. WI-IAT '08. IEEE/WIC/ACM International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3496-1
Type :
conf
DOI :
10.1109/WIIAT.2008.221
Filename :
4740460